LSTM Based Text Generation

After finishing the Deep Learning specialization on Coursera, I was intrigued by one of the programming exercises in the course Sequence Models. This is the course that covers Recurrent Neural Networks (RNNs), a powerful way of analyzing and generating text and music. The specific assignment involved making up unique dinosaur names character by character. It was a fun exercise for sure.

At the end of the course, I wanted to build my own text generator from a dataset of my choosing. I also wanted to generate text word by word rather than character by character. Furthermore, I wanted to do some hands-on AI/ML programming outside the confines of the course assignments.

First, I wanted to set up my ML environment with the necessary software. Hardware-wise, I was covered (perks of being a gamer :) Ryzen 5 1600, 16GB RAM, NVIDIA 1070). I looked at a few blogs and tried going the Linux route, but I spent almost a whole day trying to get the Anaconda and CUDA environment working properly and was just frustrated. So I decided to try the Windows route (my default OS) to see if it was any easier. I found this blog, followed the instructions, and voila! I was all set within a couple of hours.

Let's move on to the main project then.

I used the Machine Learning Mastery blog as a starting point. In that project, however, the author didn't use word embeddings, leaving them open as an extension. The author did write another post where embeddings were used (though not for the same task), so I combined the knowledge from both posts to build my own text generator.

Here are some details of the project:

Input: Like the blog, I used Plato's Republic as the input data, split into lines of 15 words that serve as the model's input.
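As a rough sketch of how that preparation might look (assuming standalone Keras and a pre-cleaned, punctuation-free token list; the file and variable names here are illustrative, not from the original code):

    # Minimal sketch: turn the cleaned text of the Republic into 15-word
    # inputs plus a 1-word target. 'republic_clean.txt' is a hypothetical name.
    import numpy as np
    from keras.preprocessing.text import Tokenizer

    with open('republic_clean.txt') as f:
        tokens = f.read().split()

    SEQ_LEN = 15  # 15 input words, 1 target word
    lines = [' '.join(tokens[i - SEQ_LEN - 1:i])
             for i in range(SEQ_LEN + 1, len(tokens) + 1)]

    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(lines)
    sequences = np.array(tokenizer.texts_to_sequences(lines))

    X, y = sequences[:, :-1], sequences[:, -1]   # first 15 words -> 16th word
    vocab_size = len(tokenizer.word_index) + 1   # word indices start at 1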

Model: I used a 3-layer, 45-unit bidirectional LSTM (built on Keras's cuDNN-accelerated layer) followed by two dense layers. The blog author used a simpler 2-layer plain LSTM (with 100 units). I found training the bidirectional cuDNN model faster than the plain LSTM. I trained for around 60 epochs with a batch size of 128 and reached an accuracy of 52%. I used sparse categorical cross-entropy for the loss instead of the categorical one, with the Adam optimizer.
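For illustration, here is a sketch of what that stack could look like in Keras. The layer sizes follow the description above, but the embedding dimension, the width of the first dense layer, and the exact stacking are my assumptions; CuDNNLSTM is the GPU-only layer from the TF 1.x-era Keras API.

    # Sketch of the model described above. The embedding size (50) and the
    # first dense layer's width (100) are assumptions, not from the post.
    from keras.models import Sequential
    from keras.layers import Embedding, Bidirectional, CuDNNLSTM, Dense

    model = Sequential()
    model.add(Embedding(vocab_size, 50, input_length=SEQ_LEN))
    model.add(Bidirectional(CuDNNLSTM(45, return_sequences=True)))
    model.add(Bidirectional(CuDNNLSTM(45, return_sequences=True)))
    model.add(Bidirectional(CuDNNLSTM(45)))     # final layer: one vector out
    model.add(Dense(100, activation='relu'))
    model.add(Dense(vocab_size, activation='softmax'))

    # Sparse loss lets the targets stay as integer word indices (no one-hot).
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer='adam', metrics=['accuracy'])
    model.fit(X, y, batch_size=128, epochs=60)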

Output: The last LSTM layer doesn't return sequences, so the model outputs a single word. That word is appended to the seed input, which is then re-fed to the model to predict the next word.
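A minimal sketch of that generation loop (the helper function and the seed string are illustrative):

    # Predict a word, append it to the running text, and keep only the last
    # 15 words as the next input to the model.
    import numpy as np
    from keras.preprocessing.sequence import pad_sequences

    def generate(model, tokenizer, seed_text, n_words, seq_len=15):
        text = seed_text
        for _ in range(n_words):
            encoded = tokenizer.texts_to_sequences([text])[0]
            encoded = pad_sequences([encoded], maxlen=seq_len, truncating='pre')
            probs = model.predict(encoded)[0]
            idx = int(np.argmax(probs))   # greedy pick of the likeliest word
            word = next(w for w, i in tokenizer.word_index.items() if i == idx)
            text += ' ' + word
        return text

    print(generate(model, tokenizer, 'and thus the just man', 50))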

Without further ado, here are some generated sentences - 


good and the unjust man who is the discernment of oligarchy and the cognate and uphill and the melody and harmony of individuals who may be made to be a arithmetical ones of them are to be destroyed by the soul of pauper and they were in the country or hymning the aid of the earth and the medicines

and others in the intermediate day are men who have a perfectly view lower and gratifying truth true he said once to be a discovery of egypt we shall be brave to lay the price of the discussion as they attain to the laws and have a common name courage and gymnastic and rock which we exercise in the mire

his difference he said the number of the shadows whose desires are to lose his nurture compels us to be the defenders of the state and the desires of the soul to exhibit the ears of the earth and desires them to learn and fasten the former division they will not be approved and they will have a common training

years after pride of his brothers lysias and euthydemus the souls were iron in time with garlands their will being to see him and escape in the way of ambition and the intellect relax nor says that he was being initiated by us to him and will be impaled and is fixed by informers and cut a journey old instances

they are beaten and make a fortune from the subtleties of the soul but there are many times more forms honourable and what is the universal tyrant which contains the oligarchical democratical last and the olympic victor the oligarchical mass of men and children and the things speak of the same athletes is the most miserable of evil

the unjust is easily more profitable to sound that which is the ruin of the bed and unjust as the oligarchical democratical i said that we are telling you because you speak a second class or the greater about the soul of the state which he has done the irrational and cherisher of being or knows he is happy

likely to have the other but will you tell me of it i cannot help fearing adeimantus i replied that i could pay if i am perplexed he adeimantus no one was agreeable to maintain that the best there is a difficulty in behold as you may call them to whom i bore the voices of the relative happiness

he is fastened to the sweet beast and fed the soul of hellas is made to be saved and gains he will not be rejected by him and temperance certainly claim to be the judge of the responsibility of peace in their education and death given to the state true and the result is not the reason why ridiculous that


It was quite interesting to read the generated text. It's grammatically correct gibberish in most cases, with tiny bits of wisdom thrown in here and there.

Final thoughts: This was a fun project to start my journey into ML/AI. I'll keep updating this blog with more projects and relevant information.

P.S. I didn't post the project's full code as I felt it was too similar to the blog posts I used to make this project; the snippets above are only rough sketches. Original/unique code will definitely be posted.
