Open-source speech recognition and text-to-speech potentially usable with the Poppy robots

software

#16

Yes, certainly. Let's see when they publish the list of remote STT solutions; if it includes open-source solutions other than Julius and CMU Sphinx, it could be interesting to add them here.

Here is their text on this, for the other readers of this thread:
"
We want to be transparent with our supporters, but at the same time
we need to keep some details of our project private until negotiations
are complete and contracts are signed. Remember that Kickstarter is not
a store. It is a place to build support and access capital to complete
amazing projects. Our platform is still more than 10 months away from
release and we have a lot of details to iron out before we ship.

We are currently evaluating several STT application interfaces
(APIs). Our software is designed to use multiple APIs simultaneously.
Partially this is to improve performance, but it is also to prevent
getting locked into a single technology or vendor. When we’ve selected
and executed agreements with our upstream STT providers we will
communicate our selection to end users.

We will also remain open to adding STT vendors in the future or bringing this portion of our technology in-house.

To preserve end user privacy we are looking at several mechanisms to
randomize STT query destinations, mask IP addresses and conceal other
personally identifiable information.

Mycroft is open source so users who don’t like our STT or AI selection can always deploy their own STT or AI back end.

Last updated:
Wed, Aug 12 2015 10:04 PM CEST"


#17

@oudeyer I have discovered the presentation of this MOOC: https://www.france-universite-numerique-mooc.fr/courses/inria/41004/session01/about made by Inria Grenoble. Maybe contacting the authors, or asking them for permission to consult the MOOC's forum, would help us complete the wiki in the first post?

This article http://www.objetconnecte.com/rentree-moocs/ seems to say that there will be a new session of this MOOC.
If that is right, it may be an opportunity to contact the authors and think about how some of the activities given to the students could improve the use of, and shared knowledge about, open-source SR and TTS. (Example of a collaborative activity: take part in a collaborative benchmark of the existing open-source TTS solutions in different languages, using for example a common text (containing ? ! : numbers, acronyms…) and putting the resulting audio files in a wiki?)
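
As a rough sketch of how such a common benchmark text could be checked before it is fed to each TTS engine, the Python snippet below verifies that a sample covers the tricky categories listed above (question mark, exclamation mark, colon, numbers, acronyms). The category names, the regular expressions, the sample sentence and the functions `coverage` and `missing_categories` are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical categories a shared TTS benchmark text should exercise.
# The regular expressions are illustrative assumptions, not a standard.
CATEGORIES = {
    "question": re.compile(r"\?"),
    "exclamation": re.compile(r"!"),
    "colon": re.compile(r":"),
    "number": re.compile(r"\d+"),
    "acronym": re.compile(r"\b[A-Z]{2,}\b"),
}


def coverage(text):
    """Return a dict mapping each category to True if the text contains it."""
    return {name: bool(pattern.search(text)) for name, pattern in CATEGORIES.items()}


def missing_categories(text):
    """List the categories the text does not cover yet, in sorted order."""
    return sorted(name for name, present in coverage(text).items() if not present)


# Hypothetical sample text; a real benchmark would need one per language.
SAMPLE = "Hello! Can the TTS read 42 and 2015 correctly? Note: SR means speech recognition."

if __name__ == "__main__":
    print(coverage(SAMPLE))
    print(missing_categories("Plain sentence without anything special."))
```

Each participant could run the same check on the text for their language, so that the audio files collected in the wiki stay comparable.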


#18

Other links…

Here is a lot of feedback about trying to use open-source SR and TTS solutions, but without good results:
http://blog.idleman.fr/raspberry-pi-09-creer-une-interface-vocale/

> https://itechnofrance.wordpress.com/2013/02/04/donnez-de-la-voix-votre-raspberry/
> Idleman, author of the article
> espeak is rather ugly but functional, yes :). I have already tested it: it is simple to use and more or less understandable, but it only does synthesis (the easiest thing to find as local open source), not speech recognition.

I saw that someone had already suggested it, but the comment seems to have gone unnoticed in the crowd.
Why not use Jasper? http://jasperproject.github.io/documentation/ , I plan to try it,
I am still waiting for the delivery of my Raspberry Pi B+.


Idleman, author of the article
Because Jasper uses pocketsphinx, which recognizes very few words in French, is a nightmare to compile, and has relatively mediocre performance :).

If someday someone finds a good open-source solution, they can also post it on that thread ( https://itechnofrance.wordpress.com/2013/02/04/donnez-de-la-voix-votre-raspberry/ ); the people there will be happy to use it.


#19

Maybe very good news:

http://www.tensorflow.org/ is now under the Apache licence. Will we be able to have quality speech recognition in French without sending the data to Google, and/or a local installation of the SR that can be used without an internet connection?

What do you think about it?


#20

Thanks for the link. I do not understand it very well. In the video, they talk a lot about speech recognition, but in the examples I do not see anything about speech recognition.
I have to read it in more detail.
But +1 for the fact that it is all in Python :slight_smile:


#21

Yes, I haven't seen any real examples of SR either :confused:


#22

Which TTS or speech recognition could run on the Odroid? I was wondering whether MaryTTS for instance could run easily on such a small processor.
@Laura , @Sophie or @Maximilien, which systems have you tried? Why did you finally choose gTTS?
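
For a quick local test on a small board, one option is to drive the eSpeak command-line tool (mentioned elsewhere in this thread) from Python. The snippet below is only a minimal sketch: it assumes the `espeak` binary is installed and on the PATH, and the helper names `espeak_command` and `speak` are hypothetical. It has not been tried on an Odroid.

```python
import shutil
import subprocess


def espeak_command(text, voice="fr", wav_path=None, speed=None):
    """Build the argument list for the eSpeak command-line tool."""
    cmd = ["espeak", "-v", voice]      # -v: voice / language
    if speed is not None:
        cmd += ["-s", str(speed)]      # -s: speed in words per minute
    if wav_path is not None:
        cmd += ["-w", wav_path]        # -w: write a WAV file instead of playing
    cmd.append(text)
    return cmd


def speak(text, **options):
    """Run eSpeak if it is installed; return False when the binary is missing."""
    if shutil.which("espeak") is None:
        return False
    subprocess.run(espeak_command(text, **options), check=True)
    return True
```

Calling `speak("Bonjour, je suis Poppy.")` would play the sentence through the default audio output, and passing `wav_path="out.wav"` would write a file instead, which makes it easier to compare engines.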


#23

Hi there,
Here is some feedback from IntRoLab (Université de Sherbrooke): Interaction homme-robot par la voix : on se comprend mon ami ! http://cursus.edu/dossiers-articles/articles/27285/interaction-homme-robot-par-voix-comprend/

They used (open source):

and (not open source):
Google speech API

More information about the open-source components used can be found at https://introlab.3it.usherbrooke.ca/mediawiki-introlab/index.php/ManyEars


#24

Hi there,

meSpeak.js is a text-to-speech solution for the Web: "speak.js is 100% clientside JavaScript. “speak.js” is a port of eSpeak, an open source speech synthesizer, which was compiled from C++ to JavaScript using Emscripten."
The project is under the GPL. (Thanks to Johann, who mentioned meSpeak.js on a Framapad; Johann makes nice open-source SVG tests for children, including programming: http://jlodb.poufpoufproduction.fr/tibibo.html?id=prog )

French is available. It sounds like a “robot voice”, but since it would be used for a robot, that is maybe not a big problem.

Do you think it could be interesting for Poppy robots?

And would it be easily usable with Arduino + Wi-Fi or NodeMCU?


#25

Article published today about Mycroft by the Ubuntu team: https://insights.ubuntu.com/2016/07/07/mycroft-the-open-source-answer-to-natural-language-platforms/


#26

The Mycroft project is very interesting!!
Here is the community
Here is the code !!

Browsing the forum, I also saw the very sexy AI samurai,
but it seems NOT to be open source.


#27

Hello,

Top 5 Open Source Speech Recognition Toolkits: http://blog.neospeech.com/2016/07/08/top-5-open-source-speech-recognition-toolkits/ (thanks to guildem for sharing it in http://linuxfr.org/nodes/110556/comments/1682139 )


#28

And here are the solutions mentioned in the wiki for Kalliope (https://www.youtube.com/watch?v=t4J42yO2rkM is amazing; see also http://linuxfr.org/news/kalliope-votre-assistant-personnel-vocal )


#29

Hello,

On http://linuxfr.org/news/kalliope-votre-assistant-personnel-vocal, Sylvain Chevalier talks about http://kaldi-asr.org/ :

For several years now, the “standard” for free speech recognition has been Kaldi,
in particular thanks to its deep learning modules
(by the way, I find the concept of
"neuron" poorly chosen for a project in this field, where “neural
nets” are everywhere). Most commercial systems use it.

Has anyone already tried it for speech recognition?


#30

Good evening,
I have just signed up here.
On this topic, I use speech recognition with Snowboy (on Pcduino and odroid x4); it is Python,
https://snowboy.kitt.ai/
It works rather well.
Have fun…
kookic


#31

It doesn't look open source, does it? We are trying to pool efforts around open-source solutions…


#32

Correct, but for me it is the “free version”.


#33

Mycroft: a new crowdfunding campaign that gives pride of place to open source for the technologies, with the selected solutions highlighted in the campaign. In English. Will an open-source version handling French be carried by the community, catalysed by this new Mycroft model? To be continued… https://www.kickstarter.com/projects/aiforeveryone/mycroft-mark-ii-the-open-voice-assistant


#34

On PrimTux, Stéphane talks about gSpeech for TTS: https://forum.primtux.fr/viewtopic.php?pid=14553#p14553 : https://github.com/lusum/gSpeech

Philippe points us to “AccessDV Linux, a distribution intended for the hearing impaired, which includes many interesting tools, notably their reading machine, a set of bash scripts allowing automatic reading from many sources.”


#35

New links:

https://blog.mozilla.org/press-fr/2019/02/28/common-voice-mutualiser-nos-voix-mozilla-publie-le-plus-grand-jeu-de-donnees-vocales-transcrites-du-domaine-public-a-ce-jour/