How to Make Your Own Audio Spaced Repetition Language Learning Files for Free

Stuart Jay Raj

on Apr 26, 2019

By the end of this article, you will have all the tools you need to automatically generate your own customised multilingual audio spaced-repetition vocabulary training files, as well as physical flash cards, to help you increase your vocabulary in the language that you are learning and develop a much more natural level of fluency.

You can watch my Webinar tutorial above, where I present this article's information. While the video focuses on Thai, the principles that I talk about and show you in this article and in the video can be done for any language – Chinese, Hindi, Japanese, Spanish, Russian. I would advise reading this whole article first, then watching how I implement it in the video.

Use All the Tools You Can Lay Your Hands on To Learn

When I was about five years old, I used to sit for hours in awe with my grandfather tuning into shortwave radio broadcasts from all over the world. He would tune into languages both that he knew and that he didn’t know and we would talk about the accents and words we heard and he would show me the connections to words, sounds and phrases that I already knew. We would even listen to foreign broadcasts in Morse code. Shortwave radio was one of the most powerful tools that he had used over the years to help him become fluent in over eleven languages. For him, the advent of the cassette recorder was a game changer. Now he could (and would) record radio broadcasts and native speakers of languages speaking as well as himself speaking all in the name of improving his own ability in the languages that he spoke.

If my grandfather were alive today, I can imagine how much fun he would have putting all the tools that we have now to good use. In order to be able to learn the way he learnt, he had to develop a certain level of technical expertise. His expertise in radio and electronics was a legacy of his job during World War II. He never stopped learning about new technology, however. He taught himself to code in BASIC and Assembly Language on his Commodore 64 and other early computing devices in the 80s and 90s, and continued to learn right up until his passing in 1997.

I learnt that developing some degree of skill in using and manipulating technology gave me a big advantage when it came to language learning. I didn’t have to depend on what already existed out there. If it didn’t exist, then I could build my own. One thing that I have learnt over the years is that it is in the creating and building process that most of the learning happens.

Right now we have technological tools that would have blown away anyone from just twenty years ago. With my phone alone I can record people’s voices, including my own, and then play them back at normal or slower speeds to analyse and learn. I can use voice-to-text recognition software. I can see pitch contours and voice prints of my voice and take videos of native speakers speaking. I can use Google’s Translate app, which even lets me point my camera at signs and see the translated text appear on them, and I can then take screenshots and store them in my ‘sign’ database. I can speak with people from every country on the planet and learn from them just as easily as calling my next door neighbour.

While I use all of the above mentioned techniques to help me learn language, today I would like to share with you one technique that I have been using lately that for me is a real game changer.

How to Get a Learning Edge

If you want to get an edge in your language learning strategy, I highly recommend starting a love affair with the terminal on your computer. I was an ardent Windows user for many years, but since I switched over to Mac OSX a few years back, I can’t see myself going back to Windows voluntarily anytime soon. One of the main reasons for me is the power I have through the terminal command line in manipulating language and text to help me learn languages. If there is one thing you take away from this article, it’s ‘don’t be afraid of the terminal’. As I’m sure any Linux users out there can testify, the terminal is a wonderful thing.

So why all this talk about the terminal? In the BASH shell used on OSX and Linux, there are many very powerful tools that are built into the system that allow you to play with and manipulate text. I’ll show you what I mean in a moment.

Before we look at manipulating text to help us build new vocabulary up in a new language, let’s look at another amazing tool we have to play with if you are using OSX.

If you have a Mac, open up the terminal window. To open the terminal, open the Spotlight search window and type:
terminal

Now type this (make sure you have the volume turned up):
say "I love learning languages."

Amazingly, your computer will say “I love learning languages”. The accent may vary depending on what default voice you have set.

You can change how fast the voice speaks by using the ‘-r’ switch.

say -r 200 "I love learning languages"

This will say “I love learning languages” at 200 words per minute.

One thing you should be noticing by now is that the voices sound amazingly authentic compared to the robot like voices we became used to back in the 90’s and early 2000’s. The prosody / rhythms of speech seem much more natural to native speaker ears.

OSX 'say' Text To Speech (TTS) Voice and Language List

I have my default voice set to the female Australian voice called ‘Karen’.

To hear her speak, you can change the voice using the ‘-v’ switch:

say -v Karen "I love learning languages."

You can see all the other voices and languages available by typing:

say -v '?'

Here is a list of the languages and voices available at the moment:


Language	Voice
ar_SA	Tarik
cs_CZ	Zuzana
da_DK	Sara
de_DE	Anna
el_GR	Melina
en_AU	Karen
en_GB	Daniel
en_IE	Moira
en_IN	Veena
en_US	Agnes
en_US	Albert
en_US	Alex
en_US	Bahh
en_US	Bells
en_US	Boing
en_US	Bruce
en_US	Bubbles
en_US	Cellos
en_US	Deranged
en_US	Fred
en_US	Hysterical
en_US	Junior
en_US	Kathy
en_US	Princess
en_US	Ralph
en_US	Samantha
en_US	Trinoids
en_US	Vicki
en_US	Victoria
en_US	Whisper
en_US	Zarvox
en_ZA	Tessa
en-scotland	Fiona
es_AR	Diego
es_ES	Monica
es_MX	Paulina
fi_FI	Satu
fr_CA	Amelie
fr_FR	Thomas
he_IL	Carmit
hi_IN	Lekha
hu_HU	Mariska
id_ID	Damayanti
it_IT	Alice
ja_JP	Kyoko
ko_KR	Yuna
nb_NO	Nora
nl_BE	Ellen
nl_NL	Xander
en_US	Pipe Organ
pl_PL	Zosia
pt_BR	Luciana
pt_PT	Joana
ro_RO	Ioana
ru_RU	Milena
sk_SK	Laura
sv_SE	Alva
th_TH	Kanya
tr_TR	Yelda
zh_CN	Ting-Ting
zh_HK	Sin-ji
zh_TW	Mei-Jia
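
With a list this long, it can help to filter it for the language you are after. Here is a minimal sketch, assuming each line that `say -v '?'` prints holds the voice name, then the locale code, then a sample sentence after a `#`. The three sample lines and the helper name `voices_for_locale` are made up for illustration, so the snippet also runs on a machine that has no `say` command.

```shell
# Stand-in for the output of: say -v '?'
# (assumed layout: voice name, locale code, "# sample sentence")
sample_voices='Karen               en_AU    # Hello, my name is Karen.
Kanya               th_TH    # Hello, my name is Kanya.
Kyoko               ja_JP    # Hello, my name is Kyoko.'

# Hypothetical helper: keep only the lines that mention the locale you ask for.
voices_for_locale() {
  grep -w -- "$1"
}

# On a Mac you would pipe the real list instead:
#   say -v '?' | voices_for_locale th_TH
thai_voices=$(printf '%s\n' "$sample_voices" | voices_for_locale th_TH)
echo "$thai_voices"
```

The same filter works for any locale code in the table above, for example `ja_JP` or `zh_HK`.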

With the exception of one or two small vowel sounds, I have found that the Kanya voice is amazingly accurate. What is even more amazing is that if the Kanya voice is told to read something in English, the result is almost perfect Tinglish! The Thai sound rules that mask English to produce ‘Tinglish’ are also applied to the input text and you get a very authentic Thai English speaking sounding voice.

So now we have a voice that is near native like that can speak to us on demand in a foreign language. Just think of the possibilities! … I have thought of many and I will share some of them with you now.

Find Vocabulary List Data

Open up a spreadsheet. I generally use Google Sheets or Microsoft Excel. Any spreadsheet will be fine, just as long as you can export to .csv (Comma Separated) format.

I will be using Thai as the target language in these examples, but you can very easily substitute any other language that has a voice available and do these same things.

The first thing I do when looking to build up my vocabulary is find good sources of vocabulary data. There are so many places you can look for this: dictionaries, bilingual books, books that have been translated into other languages (buy both copies of the book), online lessons and wordlists, word frequency lists and, my favourite, movie subtitle files. Movie subtitle files, which normally come in .srt format, are especially fascinating because all the sentences are already broken down for you, and if you get the English and Language ‘X’ subtitles from the same place, chances are that they are in sync with each other sentence for sentence. You can just open the subtitles up in a free program like Subtitle Workshop and copy and paste them into a spreadsheet.
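
If you would rather stay in the terminal, the subtitle text can also be pulled out with a few standard tools. This is only a sketch under some assumptions: both .srt files have exactly the same cues in the same order, each cue is a single line of text, and no line contains a comma. The helper name `srt_text` and the two tiny sample files are invented for the example.

```shell
# Hypothetical helper: print only the spoken text from an .srt subtitle file.
# It drops carriage returns, cue numbers, timestamp lines and blank lines.
srt_text() {
  tr -d '\r' < "$1" | grep -v -E '^[0-9]+$|-->' | grep -v '^[[:space:]]*$'
}

# Two tiny made-up subtitle files so the snippet runs anywhere.
workdir=$(mktemp -d)
printf '1\n00:00:01,000 --> 00:00:02,000\nSorry I am late.\n\n2\n00:00:03,000 --> 00:00:04,000\nSorry I have to go.\n' > "$workdir/en.srt"
printf '1\n00:00:01,000 --> 00:00:02,000\nขอโทษที่มาสาย\n\n2\n00:00:03,000 --> 00:00:04,000\nขอโทษที่ฉันต้องทิ้งคุณไป\n' > "$workdir/th.srt"

srt_text "$workdir/en.srt" > "$workdir/en.txt"
srt_text "$workdir/th.srt" > "$workdir/th.txt"

# Join the two lists side by side: one "English,Thai" pair per row.
paste -d, "$workdir/en.txt" "$workdir/th.txt" > "$workdir/subs.csv"
cat "$workdir/subs.csv"
```

Real subtitle files are messier than this sample (multi-line cues, commas, formatting tags), so check the result by eye before trusting it.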

I have just done a quick hunt around and I have found a wonderful set of new vocabulary items in Thai in the ‘Farang Can Learn Thai Language’ group on Facebook. This list has been put together by Thai teacher ‘Wondrous Thai’.

Now I am going to cut and paste the new vocab into the blank spreadsheet that I have open:


We will just use five sentences for this example, but you can use as many as you like. Just last week I created a list of 3000 sentences using this method.

I have put my sentences into three columns in this instance as ‘Wondrous Thai’ has included her own form of transliteration. You can use as many columns as you like in these files. We will just pull the necessary data to make our language training files.

Here are the sentences in textual form for you:


Sorry I’m late.	ขอโทษที่มาสาย	kŏr tôht têe maa săai
Sorry I couldn’t help you	ขอโทษที่ช่วยคุณไม่ได้	kŏr tôht têe chûay kun mâi dâai
Sorry I have to go.	ขอโทษที่ฉันต้องทิ้งคุณไป	kŏr tôht têe chăn dtông tíng kun bpai
Sorry that I made you upset	ขอโทษที่ผมทำให้คุณเสียใจ	kŏr tôht têe pŏm tam hâi kun sĭa jai
sorry for hurting you	ขอโทษที่ทำคุณเจ็บ	kŏr tôht têe tam kun jèp
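
If you want to follow along without a spreadsheet, the same five rows can be written to a CSV file straight from the terminal. This is an optional shortcut rather than part of the spreadsheet workflow, and it creates `vocab.csv` in whichever directory you run it from. It works here only because none of the sentences contains a comma.

```shell
# Write the five example rows to vocab.csv (English,Thai,transliteration).
# The quoted 'EOF' stops the shell from interpreting anything inside the block.
cat > vocab.csv <<'EOF'
Sorry I’m late.,ขอโทษที่มาสาย,kŏr tôht têe maa săai
Sorry I couldn’t help you,ขอโทษที่ช่วยคุณไม่ได้,kŏr tôht têe chûay kun mâi dâai
Sorry I have to go.,ขอโทษที่ฉันต้องทิ้งคุณไป,kŏr tôht têe chăn dtông tíng kun bpai
Sorry that I made you upset,ขอโทษที่ผมทำให้คุณเสียใจ,kŏr tôht têe pŏm tam hâi kun sĭa jai
sorry for hurting you,ขอโทษที่ทำคุณเจ็บ,kŏr tôht têe tam kun jèp
EOF

# Count the rows as a quick sanity check.
awk 'END {print NR " rows written"}' vocab.csv
```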

Once you have entered all the words in that you will use, click on File and Download As Comma Separated Values (.csv current sheet). I changed the default file name to a more suitable name - 'vocab.csv'.

By default it should save it in your Downloads directory. I created my own new directory called ‘vocab’ on my desktop and copied that .csv file there.

Open Your Terminal

Go to your terminal and go to the directory that you saved the file in. I created a directory called ‘vocab’ on my desktop, so I use the ‘cd’ (Change Directory) command:

cd ~/Desktop/vocab

Now if you type ls, you should see your file listed in that directory.

We call this terminal environment the ‘shell’. The shell we are using is called ‘BASH’, a UNIX shell and command language released in 1989 as a free replacement for the Bourne Shell. Both OSX and Linux ship with it.

Text Manipulation Tools in the Terminal can Work Magic! Introducing AWK and SED

There are some great shell tools that you can use to manipulate the data in the CSV file. The two main tools that I use, both of which come installed on OSX and Linux, are SED and AWK. ‘SED’ is a ‘stream editor’ and basically allows you to search and manipulate text in any file. AWK is a tool that allows you to manipulate data that is stored as a table in some kind of text format. You can search the data and output any combination of data, columns or calculations based on that data, all from the command line.

If you wanted to see all the contents of the CSV file that you just created, you can just type:

cat vocab.csv

Hint - if you tap the Tab key after you start typing a filename, and that file exists in the directory you are in, the filename should autocomplete. If there are multiple filenames that start with the same letters, tapping Tab twice will give you a list of the possible filename options.

Because of the font used in the terminal, some character formatting might look a little strange. Don’t worry – as long as the data in the original file is correct, you should be fine.

Now that data looks a little messy. I would like to have a list of just the first and second columns. I don’t want to see the transliteration. This is where the AWK command comes in handy. Here, $1 stands for ‘Column 1’ and $2 stands for ‘Column 2’. The -F, means that a comma is the Field Separator we are using to break up columns.

awk -F, '{print $1, $2}' vocab.csv
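
The same pattern picks out any other combination of columns. Below is a small sketch that uses a made-up two-row sample file in the same three-column layout, so it can be run without touching your real vocab.csv. The variable names are only for illustration.

```shell
# Two-row sample in the same English,Thai,transliteration layout.
workdir=$(mktemp -d)
cat > "$workdir/sample.csv" <<'EOF'
Sorry I’m late.,ขอโทษที่มาสาย,kŏr tôht têe maa săai
Sorry I have to go.,ขอโทษที่ฉันต้องทิ้งคุณไป,kŏr tôht têe chăn dtông tíng kun bpai
EOF

# Thai and transliteration only ($2 and $3), separated by a tab instead of a space.
thai_and_translit=$(awk -F, '{print $2 "\t" $3}' "$workdir/sample.csv")
echo "$thai_and_translit"

# NF is the number of columns awk found on each line.
# A row that does not show 3 here probably has a stray comma in it.
column_counts=$(awk -F, '{print NF}' "$workdir/sample.csv")
echo "$column_counts"
```

One thing to keep in mind: `-F,` splits on every comma, so a sentence that itself contains a comma will be cut into extra columns. The NF check is a quick way to spot such rows.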
Now what if I just wanted the third sentence output for both the English and Thai?

Care for a Pipe? |

In bash scripting, we have a wonderful tool called a ‘pipe’ |. Using the ‘|’ character, we can pipe the output of anything into any other program available to us. Now I will create my custom list that we have here done by AWK and then pipe it to the SED program to select the third line for me:

awk -F, '{print $1, $2}' vocab.csv | sed -n 3p
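
SED can select a range of lines as well as a single one, and AWK can do the line selection itself through its built-in line counter NR. A short sketch with a made-up five-row sample file:

```shell
# Five-row sample: English number word, Thai number word.
workdir=$(mktemp -d)
printf '%s\n' 'one,หนึ่ง' 'two,สอง' 'three,สาม' 'four,สี่' 'five,ห้า' > "$workdir/sample.csv"

# Lines 2 to 4: give sed a range "first,last" in front of the p (print) command.
middle=$(awk -F, '{print $1, $2}' "$workdir/sample.csv" | sed -n '2,4p')
echo "$middle"

# Line 3 only, without sed: NR==3 tells awk to act on the third line alone.
third=$(awk -F, 'NR==3 {print $1, $2}' "$workdir/sample.csv")
echo "$third"   # prints: three สาม
```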
Now we are getting somewhere. Just imagine if we could get the computer to say these sentences for us too!

Now to show you where we are going with this, open up your terminal and type:

say -v Kanya "ขอโทษที่มาสาย"
You should hear the Kanya voice say in perfect Thai:

“kʰɔ̌: tʰô:t tʰî: ma: sǎ:i”.

While being able to cut and paste efficiently is a very important skill to have here, I can’t stress enough how useful it is to learn to type in the language that you’re learning as early as possible. As soon as you learn the script of a language, you should be teaching yourself to type at the same time. That makes finding language specimens online much easier, as well as editing them and customising your own learning materials.

What I would like to create now is an audio file that I can loop and play back while I’m walking around town or taking a shower. I would like each file to say what number sentence it is, the English and then the Thai.

Let’s write the flow of what each line should be like this:

count : EN : TH

‘count’ is the number sentence
‘EN’ is the English sentence from each row
‘TH’ is the corresponding Thai sentence
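
Before any audio is involved, this layout can be previewed as plain text. A minimal sketch, again with a made-up sample file: AWK's built-in NR (the current line number) stands in for ‘count’.

```shell
# Two-row sample file.
workdir=$(mktemp -d)
printf '%s\n' 'Sorry I’m late.,ขอโทษที่มาสาย' 'Sorry I have to go.,ขอโทษที่ฉันต้องทิ้งคุณไป' > "$workdir/sample.csv"

# Print "count : EN : TH" for every row.
preview=$(awk -F, '{print NR " : " $1 " : " $2}' "$workdir/sample.csv")
echo "$preview"
```

Reading the preview is a quick way to catch rows where the English and Thai have slipped out of step before you spend time generating audio.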

The fastest way to get your computer to say each of the sentences would be to just pipe the AWK output we have through into the ‘say’ program.
awk -F, '{print $1, $2}' vocab.csv | say -v Kanya

The problem with doing this is that you get Kanya’s lovely voice pronouncing all of the English sentences in ‘Tinglish’. While it sounds novel for a couple of sentences, after a while it can get frustrating.

Getting 'Karen' to Say the English Sentence and 'Kanya' to say the Thai Sentence

I want to take the English sentence and pipe it into the ‘say’ program using an English speaking voice (I will use ‘Karen’, the Australian voice), and then take the Thai translation of that sentence and have ‘Kanya’ say it.

To get just the first line, for the English sentence we could type this:

awk -F, '{print $1}' vocab.csv | sed -n 1p | say -v Karen

and for the Thai sentence, we could type this:

awk -F, '{print $2}' vocab.csv | sed -n 1p | say -v Kanya

To get the English and Thai together, you end up with a very long command. I separate each of the commands with a semi-colon ‘;’:

awk -F, '{print $1}' vocab.csv | sed -n 1p | say -v Karen; awk -F, '{print $2}' vocab.csv | sed -n 1p | say -v Kanya
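
One way to tame that long command is to wrap it in a small shell function. The sketch below is an illustration only: the function name `speak_row` and its argument order are invented here, and it assumes the Karen and Kanya voices are installed. It only defines the function, so nothing is spoken until you call it on a Mac.

```shell
# Hypothetical helper: say row N of a CSV file, English first and then Thai.
# Usage: speak_row FILE ROW [ENGLISH_VOICE] [THAI_VOICE]
speak_row() {
  local file="$1" row="$2"
  local en_voice="${3:-Karen}" th_voice="${4:-Kanya}"
  local english thai
  # Pick column 1 and column 2 of the requested line.
  english=$(awk -F, -v n="$row" 'NR==n {print $1}' "$file")
  thai=$(awk -F, -v n="$row" 'NR==n {print $2}' "$file")
  say -v "$en_voice" "$english"
  say -v "$th_voice" "$thai"
}

# Examples (macOS only, so left commented out):
#   speak_row vocab.csv 1
#   for n in 1 2 3 4 5; do speak_row vocab.csv "$n"; done
```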

The good news is that this can all be done very quickly and efficiently for the whole csv file using a saved bash script. I have written it for you, so all you need to do is run it from the command line.

Making it Much More Simple

If you would like to make the bash script file yourself, open up any text editor and copy and paste the following text into it. Then save it as ‘vg.sh’. I named the file ‘vg’ as a short way of writing ‘voice generator’. Shell scripts have a file extension of .sh. My text editor of choice right now is Sublime Text.

Here is the script:

#!/bin/bash
#Saves the shell's current field separator (IFS) so that it can be restored at the end
OIFS=$IFS

#When running the script, the first parameter will be filename that holds the sentences
INPUTFILE=$1

#2nd parameter will be the No. Sentence to start with from the csv file.
START=$2

#3rd parameter is the final sentence no. from the csv file.
END=$3

#Resets the file count to 1
FILECOUNT=1

#Sets the counter to the Start number sentence entered by user
COUNTER=$START

#Sets comma as the field separator so that 'read' splits each csv line into its columns
IFS=,

#Sets a temp folder name to assemble temp files to generate recordings
TEMPFOLDER="tmp-${START}-${END}"

#If the temp folder exists, it is deleted to make a fresh start
rm -rf "${TEMPFOLDER}"

#creates the temp folder with the name set above
mkdir -p "${TEMPFOLDER}"

#sets a temp file name for a temp csv file that will hold only the selection of sentences to say
FILENAME="input-${START}-${END}.csv"

#Uses AWK with a comma delimiter (-F,) to keep only the lines (NR) from 'Start' to 'End' that were entered
# {print Lang1, Lang2} - from the input csv vocab file, then uses > to write them to the temp file name
awk -F, -v start="$START" -v end="$END" 'NR>=start && NR<=end { print $1","$2 }' "$INPUTFILE" > "${TEMPFOLDER}/${FILENAME}"

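# Starts a loop reading through the temp file one line at a time, splitting each line at the comma into lang1 and lang2 (-r keeps any backslashes in the text as they are)
while read -r lang1 lang2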

# anything inside 'do' will be done as long as there are still lines left in the file
do

# Runs 'say' in your default voice for the sentence number and the language 1 sentence, and records it as an aiff audio file. Note the [[slnc 1000]] inserts a 1 sec (1000 millisecond) pause after the number.
say -o "${TEMPFOLDER}/$(printf %05d "$FILECOUNT").${COUNTER}.en.aiff" "$COUNTER [[slnc 1000]] $lang1"


#increases the total file count by 1
let FILECOUNT+=1
#changes the voice to Kanya (Thai voice), reads the Thai column and records it as an aiff audio file
say -v Kanya -o "${TEMPFOLDER}/$(printf %05d "$FILECOUNT").${COUNTER}.th.aiff" "$lang2"

#increases the file count by 1
let FILECOUNT+=1

# Increases the overall sentence count by 1 (note there are 2 audio files per 1 sentence - one per language)
let COUNTER+=1

# 'done' closes the loop; the < feeds the temp csv file into it, one line per pass
done< "${TEMPFOLDER}/${FILENAME}"

#resets the systems standard field separator to normal
IFS=$OIFS

#use the sox audio programme to join all the individual sentence files into one big wav file
#NOTE - if you haven't installed sox on your system, you can do it via the homebrew package manager
#If Homebrew is installed on your system, just type into the command line - brew install sox
# sox must be installed before running this script
# The zero-padded numbers at the start of each file name mean the shell lists the files in playback order
sox "${TEMPFOLDER}"/*.aiff "output-${START}-${END}.wav"

How to Run The Script

When running the script, in the terminal while you are in the same directory as the .csv file with your words in it, type the following:

bash [shell script file name] [csv filename] [start sentence number] [end sentence number]

That means, if you would like to make an audio file of sentences 1 to 5 and my csv file is called vocab.csv and I saved my bash script as vg.sh, I would type:

bash vg.sh vocab.csv 1 5
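
The script trusts whatever you type after its name, so a mistyped filename or range will only show up as odd audio or an error from sox at the end. If you would like an earlier warning, a check along the following lines could be pasted near the top of vg.sh. It is a sketch, not part of the original script; the function name `check_args` and its messages are made up.

```shell
# Hypothetical pre-flight check for the three arguments given to vg.sh.
# Returns 0 when they look usable; otherwise prints a reason and returns 1.
check_args() {
  local file="$1" start="$2" end="$3" n total
  if [ ! -r "$file" ]; then
    echo "Cannot read file: $file" >&2
    return 1
  fi
  # Both numbers must be made of digits only.
  for n in "$start" "$end"; do
    case "$n" in
      ''|*[!0-9]*) echo "Start and end must be whole numbers" >&2; return 1 ;;
    esac
  done
  if [ "$start" -lt 1 ] || [ "$start" -gt "$end" ]; then
    echo "Start must be at least 1 and no bigger than end" >&2
    return 1
  fi
  # awk counts the last line even if the file has no final newline.
  total=$(awk 'END {print NR}' "$file")
  if [ "$end" -gt "$total" ]; then
    echo "File only has $total lines" >&2
    return 1
  fi
}

# Demo with a made-up three-line file.
workdir=$(mktemp -d)
printf 'a,b\nc,d\ne,f\n' > "$workdir/sample.csv"
check_args "$workdir/sample.csv" 1 3 && echo "arguments look fine"
```

Inside vg.sh it could be called as `check_args "$1" "$2" "$3" || exit 1` before anything else runs.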

Check your File!

If all went well, you will have an audio file in .wav format sitting there in that same directory with all the sentences in perfect order called output-1-5.wav:


You can play around with the script. Maybe you want to have it repeat the sentence twice in the new language you’re learning. You can use the -r switch to speed up or slow down the speech in the ‘say’ program.
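
As one possible variation, the sketch below records the Thai sentence twice: once slowly and once at the voice's normal speed. The function name `record_slow_then_normal`, the rate of 120 words per minute and the file naming are all choices made for this example, not something taken from the script above. It only defines the function, so it does nothing until called on a Mac.

```shell
# Hypothetical helper: record a Thai sentence slowly, then at normal speed, as two clips.
# Usage: record_slow_then_normal "TEXT" OUTPUT_PREFIX
record_slow_then_normal() {
  local text="$1" prefix="$2"
  # -r sets the speaking rate in words per minute; 120 is an arbitrary "slow" value.
  say -v Kanya -r 120 -o "${prefix}.1-slow.aiff" "$text"
  # No -r here, so the voice's default rate is used.
  say -v Kanya -o "${prefix}.2-normal.aiff" "$text"
}

# Example (macOS only, so left commented out):
#   record_slow_then_normal "ขอโทษที่มาสาย" sorry-late
```

The `1-slow` and `2-normal` parts of the names keep the two clips in the right order when files are listed alphabetically, which matters if you join them with sox afterwards.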

How to Make Flashcards from the Same File

The last thing to complete this customised language tool development tutorial is to make some quick and easy physical flash cards.

The way I do this is to use that same CSV file, create a template in MS Word or any other word processor that has Mail Merge functions and do a mail merge using the fields in the CSV file.

I’ll show you what I mean.

Suppose I wanted to make some cards that only had the Thai writing and transliteration on them.

Just open up a new MS Word document and create a card prototype that you would be happy with. Here’s a quick one I made up with the Thai and transliteration from the csv file we have.

Open the csv file up in MS Excel. If the encoding is not working properly, you can go to your original file in Google Sheets and just cut and paste the data across. Save that file as an Excel (.xlsx) of the same name. Add a row at the top and give the columns names “English”, “Thai” and “Transliteration”. You can name them whatever you want.
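
The header row can also be added from the terminal before the file ever reaches Excel. A minimal sketch, using a made-up one-row sample and an output name (`vocab-with-header.csv`) chosen just for this example:

```shell
# One-row sample standing in for your real vocab.csv.
workdir=$(mktemp -d)
printf '%s\n' 'Sorry I’m late.,ขอโทษที่มาสาย,kŏr tôht têe maa săai' > "$workdir/vocab.csv"

# Print the header line, then the original rows, into a new file.
{
  echo 'English,Thai,Transliteration'
  cat "$workdir/vocab.csv"
} > "$workdir/vocab-with-header.csv"

head -n 1 "$workdir/vocab-with-header.csv"
```

Writing to a new file rather than over the original keeps vocab.csv header-free, which matters because the audio script treats every line as a sentence.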

Now go to Tools - Mail Merge Manager and in the ‘Create New’ section, select ‘Catalogue’

Click ‘Data Source’ and locate the new Excel (xlsx) file with the column headers and select it. Drag the fields of the columns you would like to display to the appropriate places on your flash card template.

Follow the prompts and you should see the fields from the column headings show up as fields that you can drag and drop into your document. Drag them into the text place holders that you already set up.

Now all you have to do is click the ‘Merge’ button down in the bottom section of the Mail Merge Manager dialogue box and you will have a new file generated with beautifully formatted flash cards for each sentence.

Play around with the formatting and if you are using tables, you can set them so that they won’t break across pages. All you need to do now is print them out, cut them out and use them in conjunction with your newly produced custom audio files.

Was it really necessary to go to all this trouble to create your own learning resources? I would say yes, it most certainly is! Spending this quality time with the texts (finding your data, manipulating texts, making your own audio files and cards) really adds value to the language for you. You have invested your time and energy in it and you can hone your solution to suit you precisely. As you play around with the text and audio and make them, you will be picking up the language along the way too. If you type the words into the spreadsheet by hand, then with every word you type you are reinforcing it in your own memory, and the experience of typing it in works as a memory peg for those words.

This is just the tip of the iceberg. Go and have a play with all the different tools available to you. Do some tutorials on using the bash terminal and commands like AWK and SED, and try your hand at making your own permutations of flash cards with different data fields to serve different purposes depending on what area you’re targeting. You might leave the transliteration off and test how well your reading is. Now you have the tools to create whatever works for you.

Now it’s your turn. Let me know what kind of resources you have managed to make for yourself.

As an update to this original post, thanks go to Kris Willems and others from the Farang Can Learn Thai group on Facebook, who have gone and found out much more that can be done, especially in the way of using Anki flash cards with OSX’s voice module or Google’s through the AwesomeTTS plugin. See the article they referenced here: