# A Siri-like AI Assistant

* Uses ChatGPT (or an alternative LLM) for general queries
* Uses the Wolfram Alpha API for anything math related
* Has built-in NLP (using an NLI model) to determine whether a query can be processed locally (skills system)
* Frontend/backend architecture, enabling lightweight client deployments

## Skills

- [ ] Translations
- [ ] Alarms (potentially complete, if we reuse the Timers logic)
- [ ] Calendar
- [ ] Gmail
- [ ] ChatGPT
- [ ] Reminders
- [x] Timers
  - TODO: add sound notifications
- [ ] Todos
- [ ] Weather
- [ ] Wolfram
- [x] NLP
- [x] Speech to Text (frontend for sure)
- [x] Phone
  - [x] Initial implementation where the number is sent to the phone
  - [ ] NLP name lookup to check contacts
  - [ ] iCloud Contact API
- [ ] API
  - [ ] Authentication
  - [ ] General API
- [ ] TTS
  - Generate audio on the backend or the frontend?
    - Pros of backend: fast generation
    - Cons of backend: large file transfers between devices, heavy internet usage
    - Pros of frontend: less data transfer between devices, so less internet usage
    - Cons of frontend: slower generation
  - Current solution: https://github.com/synesthesiam/opentts
  - Currently hosted instance: [tts.imsam.ca](https://tts.imsam.ca)

## API Specs

Using WebSockets for communication allows two-way communication, where the server can send the client information at any point.

Example: https://stackoverflow.com/questions/53331127/python-websockets-send-to-client-and-keep-connection-alive

More examples (includes JWT authentication; although this is in Node.js, it is still useful for figuring out how to approach this): https://www.linode.com/docs/guides/authenticating-over-websockets-with-jwt/

## Ideas

* Dashboard with API call counts (would require hooking into all active skills; callbacks with class inheritance, maybe?)
* Phone calls from the Jarvis speaker
* "JARVIS, initiate the House Party Protocol" (take over the screen and show a retro-style text interface, possibly showing data from the dashboard)

## Wants, but limitations prevent

* *tumbleweed bounces by* Oh, dear.
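The skills system above decides whether a query can be handled by a local skill or must fall through to the LLM. A minimal sketch of such a router, using a simple keyword-overlap score as a stand-in for the NLI model (the `SkillRouter` class, the keyword lists, and the `"chatgpt"` fallback name are all hypothetical):

```python
# Sketch of a skill router: score each registered skill against the query and
# fall back to the LLM when no local skill matches. The keyword-based scorer
# is only a placeholder for the NLI model described above.

class SkillRouter:
    def __init__(self, fallback="chatgpt"):
        self.skills = {}          # skill name -> set of trigger keywords
        self.fallback = fallback  # used when no local skill matches

    def register(self, name, keywords):
        self.skills[name] = {k.lower() for k in keywords}

    def route(self, query):
        words = set(query.lower().split())
        # Pick the skill sharing the most keywords with the query.
        best, score = max(
            ((name, len(words & kws)) for name, kws in self.skills.items()),
            key=lambda pair: pair[1],
            default=(None, 0),
        )
        return best if score > 0 else self.fallback


router = SkillRouter()
router.register("timers", ["timer", "countdown", "minutes"])
router.register("weather", ["weather", "forecast", "rain"])

print(router.route("set a timer for five minutes"))  # timers
print(router.route("will it rain tomorrow"))         # weather
print(router.route("tell me a joke"))                # chatgpt (fallback)
```

Swapping the scorer for an actual NLI model would only change `route`; the registration interface could stay the same.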
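For the Authentication TODO, the Linode guide linked above demonstrates JWT over WebSockets in Node.js. The token side translates to Python directly; below is a standard-library-only HS256 encode/verify sketch (the secret and claim names are placeholders, and a real deployment would more likely use a library such as PyJWT):

```python
import base64
import hashlib
import hmac
import json
import time

# Minimal HS256 JWT encode/verify, as a sketch of the token half of
# WebSocket authentication. The client would send this token on connect;
# the server verifies the signature and expiry before accepting messages.

def _b64(data: bytes) -> str:
    # JWTs use unpadded base64url.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def encode(claims: dict, secret: str) -> str:
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64(sig)}"

def verify(token: str, secret: str):
    """Return the claims dict, or None if the token is invalid or expired."""
    try:
        header, payload, sig = token.split(".")
    except ValueError:
        return None
    signing_input = f"{header}.{payload}".encode()
    expected = _b64(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None
    # Re-pad before decoding; urlsafe_b64decode needs length % 4 == 0.
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    if claims.get("exp", 0) < time.time():
        return None
    return claims


secret = "change-me"  # placeholder secret
token = encode({"sub": "client-1", "exp": time.time() + 3600}, secret)
print(verify(token, secret))        # claims dict
print(verify(token + "x", secret))  # None (bad signature)
```

On the server, the WebSocket handler would call `verify` on the first message (or a query parameter) and close the connection when it returns `None`.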
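The dashboard idea above mentions hooking into all active skills via callbacks with class inheritance. One way that could look: a base class every skill inherits from, which counts each handled call in a shared registry the dashboard reads. A sketch under those assumptions (the `Skill`/`TimerSkill` names and the `handle`/`run` interface are made up here):

```python
from collections import Counter

# Sketch of the dashboard hook: the base class intercepts every call via
# handle(), increments a shared counter, then delegates to the subclass's
# run(). The dashboard only needs to read Skill.call_counts.

class Skill:
    call_counts = Counter()  # shared across all skills; read by the dashboard

    def handle(self, query: str) -> str:
        Skill.call_counts[type(self).__name__] += 1
        return self.run(query)

    def run(self, query: str) -> str:
        raise NotImplementedError

class TimerSkill(Skill):
    def run(self, query: str) -> str:
        return "timer set"

class WeatherSkill(Skill):
    def run(self, query: str) -> str:
        return "sunny"


timers, weather = TimerSkill(), WeatherSkill()
timers.handle("set a timer")
timers.handle("another timer")
weather.handle("forecast?")
print(dict(Skill.call_counts))  # {'TimerSkill': 2, 'WeatherSkill': 1}
```

Because the counting lives in the base class, new skills get dashboard coverage for free as long as callers go through `handle` rather than `run`.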