There are currently two major commercial systems bringing "artificial intelligence" into the home: Google Home (via Google Actions) and Amazon Alexa. I put "artificial intelligence" in quotes because the principal AI component seems to be speech recognition, together with, in Google's case at least, the ability to deliver information based on its "big data" processing of the Web.
Be that as it may, Mycroft is a reasonably open-source system playing in the same space. That is, it listens to talk in the home and, when a trigger phrase is encountered, attempts to perform actions based on the phrases spoken next.
A Mycroft client may be downloaded to a Linux system and built. It contains a core and an extensible set of skills. The skills perform tasks such as responding to "what is the time?" (which can be answered locally) or "what is the weather?" (which invokes a Web request). These skills are written in Python and follow a simple structure, discussed by example later.
The document Mycroft core overview discusses the top-level view of the Mycroft architecture and gives the normal path of a query through the system.
Mycroft runs within a Python virtual environment. This is done to ensure that the correct runtime environment is present: everything is stored in the virtual environment, which avoids problems with missing packages, or packages upgraded in incompatible ways. It also ensures that python resolves to python2 and not python3!

But now if you want to use a package that is not in the virtual environment, you have to install it there, which is a bit more complicated than just using pip to install a package into the normal environment.
To install a package such as fuzzywuzzy, run:
$ source venv-activate.sh
Entering mycroft-core virtual environment. Run 'deactivate' to exit
(.venv) Desktop:$ pip install fuzzywuzzy
Collecting fuzzywuzzy
Using cached fuzzywuzzy-0.15.1-py2.py3-none-any.whl
Installing collected packages: fuzzywuzzy
Successfully installed fuzzywuzzy-0.15.1
Leave the virtual environment by running deactivate.
The command start.sh has a possible parameter of sdkdoc. Running this fails, as the Python module pdoc is not in the virtual environment. You need to install it as in the last section: start the virtual environment and use pip to install pdoc.
Then you can run
start.sh sdkdoc
The resultant HTML files are in the subdirectory build/doc/mycroft-skills-sdk/html/mycroft/. Currently the documentation isn't really worth looking at.
The default configuration shows the location as Lawrence, Kansas. However, setting the location in the config file probably isn't the best way of doing it: when Mycroft starts, it queries the Mycroft web backend for its location. It is easier to log in to home.mycroft.ai, select devices and then your chosen device. This allows the location to be set.
For anyone with a multitude of devices in the home, keeping track of remotes is a nightmare. Universal remotes replace these with a single remote controller which emulates the others. But they still control only a single device at a time.
I have a set of linked devices. When I want to watch something on my Blu-Ray player, I have to turn it on, turn on the TV and set it to the correct HDMI input, and turn on the amplifier and set it to the Blu-Ray input. Such a set of events for a single activity is common. The Logitech Harmony series of devices addresses this issue by allowing a set of devices to be configured under the banner of an activity.
I have used the Harmony One for many years as an activity-based remote: tedious to set up, but a joy to use afterwards. The Harmony Hub is the latest model; its remote now talks to a Hub device, which is connected to your network by WiFi and uses IR to talk to your devices. The network capability brings it into the world of the IoT.
Logitech has worked with Google to develop an Action so that Google Home can recognise and forward commands related to the Hub, such as
Okay Google, ask Harmony to watch Blu-Ray
Google Home is great, but I can't see the details of the Google Home integration, so I'm relying on Logitech and Google to keep that interaction working.
There is also an IFTTT stanza to talk to the Harmony Hub. IFTTT is great, but I can't see the details of the stanza, so I'm relying on Logitech and IFTTT to keep that interaction working.
These are external services, not under my control, manipulating a device within my own home. Not completely satisfactory. Fortunately there has been an open source effort to decode the protocol used by the Harmony Hub. The latest version of this is a Java package at tuck182/harmony-java-client: Java client for communicating with a Harmony Hub.
The Java Hub client makes use of a logging package, which is great for ... debugging. But it gets in the way when trying to interact with the client programmatically, as results and log output appear mixed together on stdout.
Logging is done using the slf4j logging framework, which is implemented by a logging library such as Log4j. Logging is nice most of the time, but it is sometimes nice to be without it. Unfortunately, there isn't a simple way to turn it off.
One way that works for me is to include the file slf4j-nop.jar, a "no-op" logger implementation, in the classpath. But if you run a jar file using the -jar option, the classpath is ignored. Also, there is a default logger implementation included in the harmony-java-client jar file. So first get rid of that default by deleting it from the jar file:
zip -d harmony-java-client-master-1.2.1-all.jar org/slf4j/impl/StaticLoggerBinder.class
Then you can start the Java Hub client using the classpath option by
java -cp .../slf4j-nop.jar:.../harmony-java-client-master/build/libs/harmony-java-client-master-1.2.1-all-nodebug.jar net.whistlingfish.harmony.Main HUB_IP
where the '...' are the relative or absolute paths to the jar files and HUB_IP is the IP address of the Harmony Hub.
The Java client reads commands in simple English from stdin and writes responses to stdout. It always starts by telling you the current activity according to the Hub.
| Request | Response |
|---|---|
| (none: printed on startup) | activity changed: [28199547] Listen to Radio |
| list activities | 28199546: Play Game<br>28199556: Chromecast<br>28199554: External PC<br>-1: PowerOff<br>28238778: Watch Blu-Ray<br>28199547: Listen to Radio |
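The "id: name" lines in a response like the one above are easy to pick apart programmatically. A minimal sketch (the parse_activities helper and the sample text are mine for illustration, not part of the client):

```python
def parse_activities(response):
    """Parse the 'id: name' lines of a 'list activities' response
    into a dict mapping activity name to activity id, skipping the
    'activity changed' status line the client always emits first."""
    activities = {}
    for line in response.splitlines():
        if ":" in line and not line.startswith("activity changed"):
            activity_id, _, name = line.partition(":")
            activities[name.strip()] = activity_id.strip()
    return activities

sample = ("activity changed: [28199547] Listen to Radio\n"
          "28199546: Play Game\n"
          "-1: PowerOff\n"
          "28238778: Watch Blu-Ray\n")
print(parse_activities(sample))
# {'Play Game': '28199546', 'PowerOff': '-1', 'Watch Blu-Ray': '28238778'}
```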
Any skill requires at least one Python program to execute commands, called intents. It also requires a directory vocab/en-us listing the phrases (in US English) that trigger an intent, and a directory dialog/en-us for responses (in US English). The language can of course be changed.
The file structure for the Mycroft Harmony Hub client is
./__init__.py
./vocab/en-us/StartActivityKeywords.voc
./vocab/en-us/ListDevicesKeywords.voc
./vocab/en-us/ListActivitiesKeywords.voc
./vocab/en-us/ShowActivityKeywords.voc
./dialog/en-us/start.activity.dialog
./dialog/en-us/list.devices.dialog
./dialog/en-us/list.activities.dialog
./dialog/en-us/show.activity.dialog
The file StartActivityKeywords.voc contains the lines
harmony start
harmony begin
which are the triggers for the "start activity" intent.
The file start.activity.dialog contains the response for a successful invocation:
Starting activity {{activity}}
where {{activity}} is a variable substitution set by the Python code.
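The {{...}} notation is Mustache-style templating. Mycroft's own dialog renderer performs the substitution, but the effect can be sketched in a few lines of Python (render_dialog is a hypothetical stand-in of mine, not Mycroft's API):

```python
import re

def render_dialog(template, data):
    """Replace {{key}} placeholders with values from a dict,
    leaving unknown placeholders untouched."""
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: str(data.get(m.group(1), m.group(0))),
                  template)

print(render_dialog("Starting activity {{activity}}",
                    {"activity": "Watch Blu-Ray"}))
# Starting activity Watch Blu-Ray
```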
The file __init__.py contains code to handle each intent. Corresponding to the dialogs and vocabularies listed earlier, there are the intents
StartActivityIntent
ListDevicesIntent
ListActivitiesIntent
ShowActivityIntent
Each intent is associated with the vocabulary that triggers it by, for example,
IntentBuilder("ListActivitiesIntent").\
    require("ListActivitiesKeywords").build()
There are several ways of associating code with an intent; Mycroft now seems to favour a "decorator" mechanism, which is less readable than earlier ones.
The following links the method list_activities_intent to the ListActivitiesIntent handler.
@intent_handler(IntentBuilder("ListActivitiesIntent").\
                require("ListActivitiesKeywords").build())
def list_activities_intent(self, message):
    """List all activities Harmony knows about"""
Now we can turn to the code to process the intent. We use a helper method (given later), send_command, which writes a string to the Java Hub client and returns the response as a string. Every response starts with the current activity, so all the intent methods drop this first line and keep an array of the remaining lines:
result = self.send_command('list activities')
# lose first line "activity changed..."
activities = result.split("\n", -1)[1:]
The activities are listed in the form "id: name", so now it is a matter of working through the list and showing the name.
The output is given by calling the inherited speak_dialog method, which takes the dialog name and a dictionary of variable/value pairs to be interpolated into the dialog (which contains the pattern {{activity}}).
The complete code for the ListActivitiesIntent
is
@intent_handler(IntentBuilder("ListActivitiesIntent").\
                require("ListActivitiesKeywords").build())
def list_activities_intent(self, message):
    """List all activities Harmony knows about"""
    LOGGER.debug("Harmony: list activities")
    result = self.send_command('list activities')
    # lose first line "activity changed..."
    activities = result.split("\n", -1)[1:]
    for activity in activities:
        activity_name_loc = string.find(activity, ":")
        # ignore non ':' lines
        if activity_name_loc >= 0:
            activity_name = activity[activity_name_loc+2 : ]
            report = {"activity": activity_name}
            self.speak_dialog("list.activities", report)
An intent such as "list activities" doesn't take any parameters. An intent such as "start Blu-Ray" does have a parameter, and the speech recognition engine will need to get that right. The Java Hub client on my home system will recognise "Blu-Ray", but not "Blu Ray", "blu-ray", "blue ray" or any of the other possibilities. It will, however, recognise "28238778", as that is the id belonging to that activity. Something has to disambiguate the recognised speech to an unambiguous form that the device handler will recognise.
Perhaps the Harmony Hub recognises all the different forms that could be sent by IFTTT or by Google Home: without access to the code we don't know. We can change the code of the open source Java Hub client, but for now it is just easier to disambiguate the text in the Python intent, and send the unambiguous id. So we do pattern matching in the Python intent.
There are several pattern matching engines available as Python packages. I chose the fuzzywuzzy package. I use it in a method that takes an array of activity strings of the form "id: name", finds the best match of the name against an input pattern, and returns the best-matching "id: name" string:
def best_match(self, str, options):
    """Return best fuzzy match to a string from a list of strings
    """
    max_match = 0
    option_match = ""
    for option in options:
        colon_at = string.find(option, ':')
        if colon_at >= 0:
            option_name = option[colon_at+2 : ]
            fuzz_value = fuzz.ratio(str, option_name)
            if fuzz_value > max_match:
                max_match = fuzz_value
                option_match = option
    return option_match
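For readers without fuzzywuzzy installed, the same idea can be illustrated with the standard library's difflib. The scoring differs slightly (fuzz.ratio is Levenshtein-based), but the structure is identical; best_match_stdlib and the sample options below are mine for illustration:

```python
from difflib import SequenceMatcher

def best_match_stdlib(pattern, options):
    """Return the 'id: name' option whose name part best matches
    pattern, scored with difflib instead of fuzzywuzzy."""
    max_ratio = 0.0
    option_match = ""
    for option in options:
        colon_at = option.find(":")
        if colon_at >= 0:
            option_name = option[colon_at + 2:]
            ratio = SequenceMatcher(None, pattern.lower(),
                                    option_name.lower()).ratio()
            if ratio > max_ratio:
                max_ratio = ratio
                option_match = option
    return option_match

options = ["28238778: Watch Blu-Ray",
           "28199547: Listen to Radio",
           "-1: PowerOff"]
print(best_match_stdlib("blue ray", options))
# 28238778: Watch Blu-Ray
```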
That gives us the best match, once we have isolated the string to match against. Here it gets a bit obscure and undocumented. Each intent handler is passed a message parameter. This is of type mycroft.messagebus.Message and has a number of fields, including a dictionary. The dictionary key utterance contains the spoken string, such as "harmony start blue ray". There are (human) language dependencies here: a French version might be "harmony commencer[sic] bleu[sic] ray", while a Chinese version might be "harmony 开始 blue ray". (No, sorry, I don't have multilingual versions at present, but that is what they could look like if I did.) So to extract the payload of the key string, we have to discard the possibly language-dependent initial phrase. This is contained in the dictionary entry StartActivityKeywords.
Putting that all together gives us
@intent_handler(IntentBuilder("StartActivityIntent").\
                require("StartActivityKeywords").build())
def start_activity_intent(self, message):
    """Start a Harmony activity

    The message has the activity, but maybe not quite in
    the form required by the Harmony controller.
    So get the accepted list, match against it and invoke
    the best match
    """
    LOGGER.debug("Harmony: start activity " + message.data.get('utterance'))
    key = str(message.data.get(u'StartActivityKeywords'))
    utterance = str(message.data.get(u'utterance'))
    payload = string.replace(utterance, key, "")
    payload = string.strip(payload)
    LOGGER.debug("Harmony: utterance: " + utterance +
                 " key " + key + " payload " + payload)
    # Java interface to Harmony is case sensitive
    # and we are best off using the activity id rather than name
    # so first we have to get the activities with id and name
    result = self.send_command('list activities')
    # lose first line "activity changed..."
    activities = result.split("\n", -1)[1:]
    # and find the best match
    activity = self.best_match(payload, activities)
    # activity is of the form "id: name", get the id
    activity_name_loc = string.find(activity, ":")
    id = activity[0 : activity_name_loc]
    activity_name = activity[activity_name_loc+2 : ]
    LOGGER.debug("Harmony: id: " + str(id))
    result = self.send_command('start ' + id)
    # say what is happening
    report = {"activity": activity_name}
    self.speak_dialog("start.activity", report)
This method sends a message to the Java Hub client and returns the string response:
def send_command(self, command):
    """Send a command to the Java Harmony controller

    Returns a string of the response from the Java process
    """
    p = Popen(["java", "-cp",
               JAVA_CLASSPATH,
               "net.whistlingfish.harmony.Main",
               HARMONY_HOST_IP], stdin=PIPE, stdout=PIPE, bufsize=1)
    # communicate() signals the child to exit by closing stdin,
    # reads the rest of the output and waits for the child to
    # terminate; it returns a (stdout, stderr) tuple
    response = p.communicate(command)[0]
    return response
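The communicate() call used in send_command writes the command, closes the child's stdin (signalling the child to exit), reads the remaining output and waits for the child to terminate. A stand-alone illustration on a Unix system, with cat playing the role of the Java Hub client (it simply echoes the command back):

```python
from subprocess import Popen, PIPE

# 'cat' stands in for the Java Hub client here: it copies stdin
# to stdout, then exits when communicate() closes its stdin.
p = Popen(["cat"], stdin=PIPE, stdout=PIPE, universal_newlines=True)
response = p.communicate("list activities\n")[0]
print(repr(response))
# 'list activities\n'
```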
Running Mycroft on your PC or laptop is one way to go. Buying the cool-looking Picroft is another way.
A third way is to build your own device in some other container. A constraint is that you will need a good, small microphone that can be built into your Mycroft "container". Two choices are the Matrix Voice and the Google AIY Voicehat, part of the Google AIY voice kit.
I bought the AIY voice kit, but it was a hassle connecting it up to all the Google services, and anyway, I wanted to play with Mycroft. Now there is an image that will run on the RPi 3B+: follow the instructions in HACKING.md at the aiyprojects-raspbian site. This image has drivers for the Google AIY Voicehat.
Then you can download the latest version of Mycroft for Linux and build that in the normal way for Mycroft. It takes time to build, but the resultant Mycroft system runs fine on the Voicehat.
The alternative of adding the Voicehat drivers to Picroft is not feasible for me at the moment, as the only spare RPis I have are all model 3B+, and as of Sept 2, 2018 this model is not supported. The HACKING.md page contains instructions about the drivers needed when it becomes possible to run Picroft on the 3B+.
Copyright © Jan Newmarch, jan@newmarch.name
"The Internet of Things - a techie's viewpoint" by Jan Newmarch is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://jan.newmarch.name/IoT/.
If you like this book, please donate using PayPal