This is the third blog post in the series.
False starts and dead ends
Last time I trained GPT-Neo on text from The Art of War, and it led to shit results. So I tried to improve the data. I trained on lines instead of paragraphs; it did nothing.
Then I thought most of the text in the book was irrelevant, so I trained on just the quotes, where I ran into an error because there wasn’t enough training data.
Then I went back to GPT-3 and created a prompt in which each “Sun Tzu says:” was followed by a quote from The Art of War.
Then I asked GPT-3 to complete the text with a random prompt. This surprisingly gave very good results.
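Here is a rough sketch of how a prompt like that could be put together. It is not my exact code: the quotes below are just illustrative picks, `build_prompt` and its parameters are made-up names, and I am assuming "random prompt" means a random handful of quotes followed by an open "Sun Tzu says:" line for the model to finish.

```python
import random

PREFIX = "Sun Tzu says:"

# Illustrative quotes only; the post does not list which ones were used.
QUOTES = [
    "Appear weak when you are strong, and strong when you are weak.",
    "The supreme art of war is to subdue the enemy without fighting.",
    "In the midst of chaos, there is also opportunity.",
    "Know yourself and you will win all battles.",
]


def build_prompt(quotes, n_examples=3, seed=None):
    """Build a few-shot prompt: several quoted lines, then an open line."""
    rng = random.Random(seed)
    # Pick a random subset so each run gives the model different examples.
    examples = rng.sample(quotes, k=min(n_examples, len(quotes)))
    lines = [f"{PREFIX} {quote}" for quote in examples]
    # The last line has no quote; the model is asked to complete it.
    lines.append(PREFIX)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(QUOTES, n_examples=3))
```

The resulting string is what would be sent to the completion endpoint, with the newline used as a stop sequence so the model returns a single quote.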
Creating the bot
The next part was tying it to the Twitter API to make the posts. I was expecting this part to be difficult; it wasn’t.
Twitter gave a link to a Glitch app, which handled the OAuth for me; I copied the token and used it. Setting up a backend just for OAuth would have been annoying.
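For the curious, posting boils down to one authenticated HTTP request. The sketch below is a minimal version under a few assumptions: it targets the Twitter API v2 `POST /2/tweets` endpoint, it assumes the copied token is an OAuth 2.0 user access token that is allowed to write tweets, and `build_tweet_request` is a hypothetical helper name. It only builds the request, so it can be inspected without touching the network.

```python
import json
import urllib.request

TWEETS_URL = "https://api.twitter.com/2/tweets"
MAX_TWEET_LEN = 280


def build_tweet_request(text, access_token):
    """Return a ready-to-send POST request for a single tweet."""
    text = text.strip()
    if not text:
        raise ValueError("tweet text is empty")
    # Naive length cap; Twitter's real character counting is weighted.
    if len(text) > MAX_TWEET_LEN:
        text = text[: MAX_TWEET_LEN - 1].rstrip() + "…"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        TWEETS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is then `urllib.request.urlopen(build_tweet_request(quote, token))`, with the token read from an environment variable rather than hard-coded.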
Saasing it up with an AI profile pic
AI Sun Tzu doesn’t just need to be wise, he also needs to look wise (and old but modern).
This is the Twitter profile of our prolific AI Sun Tzu, and both the profile picture and the cover were, ofc, created using AI.
How did I make it tweet regularly
GitHub actions for the win 💪🏻
Conclusion
None of the alternative NLP models I tried (Cohere, GPT-Neo, or the Jurassic models by ai21.com) were as good as GPT-3.
Fine-tuning a bad model with good data is not helpful. In general, I couldn’t fine-tune the model to a usable quality. It is probably worth trying again.
IDK what else to conclude, overall it was a lot of fun!
Breadcrumbs
(Breadcrumbs are a list of random discoveries and resources which I came across along the way.)
forefront.ai → They have an OpenAI-like playground for open-source models and a bunch of APIs for fine-tuned models. Similar companies:
https://nlpcloud.com → Gives CPU/GPU access to open-source AI models
https://riku.ai
The style of Japanese paintings I really like is called Ukiyo-e
Text generation libraries that I came across:
https://github.com/jsvine/markovify → Uses Markov chains, doesn’t look like it uses large language models.
https://github.com/minimaxir/aitextgen → Took tremendously long to train and produced shit results. Maybe it will work better with better-quality data, or when larger open-source models like BLOOM become available.
Other misc things:
https://m3o.com/ → A collection of APIs to use for your projects, looks interesting, didn’t use.
https://transitivebullsh.it/saasify-key-takeaways → Blog post by a guy who built a tool to help people create micro-SaaS products. Was interesting because it kinda overlaps with another project I was thinking of.
TBH I feel his conclusions are wrong. He shut down the company because he failed to raise VC money, and he also shared the feedback he got from the VCs. VC feedback doesn’t mean much.
https://github.com/fireship-io/gpt3-twitter-bot → Code for another GPT-3 bot; his video gave me the idea for how to design the prompts.
https://deepai.org/apis
They have an AI API Marketplace
There are a lot of APIs on the marketplace and some look pretty good, for example the text summarization API is really good.
Ideas from the project:
Have an extension which marks GitHub repos as “Setup working” so that you know if you can actually get the code up and running. I ran into a bunch of repos which looked interesting, but their README was outdated/incomplete.
If we can do this automatically, that will be awesome!
Create a twitter bot “animals loving” which retweets photos of animals loving each other.
Create a twitter account “mid-wit-meme” which retweets all the mid-wit memes tweeted by people. Why? Because I love mid-wit memes.