Elasticsearch Frustration: The Curious Query

Last year I was poking at an Elasticsearch cluster to review the indexed data and verify that things were healthy. It was all good until I stumbled upon this weird document:

  "_version": 1,
  "_index": "events",
  "_type": "event",
  "_id": "_query",
  "_score": 1,
  "_source": {
    "query": {
      "bool": {
        "must": [
            "range": {
              "date_created": {
                "gte": "2016-01-01"

It may not be immediately obvious what's going on in the above snippet. Instead of a valid event document, there's a document with a query as the contents. Additionally, the document ID appears to be _query instead of the expected GUID. The combination of these two irregularities makes it seem as if someone accidentally posted a query to the wrong endpoint. No problem, just delete the document, right?

DELETE /events/event/_query
ActionRequestValidationException[Validation Failed: 1: source is missing;]


I reached out to some of my coworkers to see if they could point me in the right direction, but all that I received was an (unhelpful) "I've seen this error before, and we solved it, but no one seems to remember how it was done." Great.

After much head-scratching, it turns out that, since the ID is _query, Elasticsearch's URL router thinks that I'm trying to issue a query and validates the HTTP action as such. Part of that validation is the requirement that queries have a body. Oops.

While passing an empty object should conceivably have worked, I wanted to play things extra safe in case ES was executing the query (this was production, after all (why are you looking at me like that?)), so I passed in a query object that constrained the results to only the problematic document.

DELETE /events/event/_query
{
    "query": {
        "match": {
            "_id": "_query"
        }
    }
}

... and the document was deleted successfully! Hopefully putting this to blog form will help others who encounter it in the future (including me).

Switching to NeoVim (Part 2)

2016-11-03 Update: Now using the XDG-compliant configuration location.

Now that my initial NeoVim configuration is in place, I'm ready to get to work, right? Well, almost. In my excitement to make the leap from one editor to another, I neglected a portion of my attempt to keep Vim and NeoVim isolated - the local runtimepath (usually ~/.vim).

"But Aru, if NeoVim is basically Vim, shouldn't they be able to share the directory?" Usually, yes. But I anticipate, as I start to experiment with some of the new features and functionality of NeoVim, I might add plugins that I want to keep isolated from my Vim instance.

I'd like Vim to use ~/.vim and NeoVim to use ~/.config/nvim. Accomplishing this is simple - I must first detect whether or not I'm running NeoVim and base the root path on the outcome of that test:

if has('nvim')
    let s:editor_root=expand("~/.config/nvim")
else
    let s:editor_root=expand("~/.vim")
endif

With the root directory in a variable named editor_root, all that's left is a straightforward find and replace to convert all rtp references to the new syntax.

e.g. let &rtp = &rtp . ',.vim/bundle/vundle/' becomes
     let &rtp = &rtp . ',' . s:editor_root . '/bundle/vundle/'

With those replacements out of the way, things almost worked. Almost.

I use Vundle. I think it's pretty rad. My vimrc file is configured to automatically install it and download all of the defined plugins in the event of a fresh installation. The first time I launched NeoVim with the above changes in place, however, I didn't get a fresh install - it was still reading the ~/.vim directory's plugins.

Perplexed, I dove into the Vundle code. Sure enough, it defaults to installing plugins to $HOME/.vim if a directory isn't passed in to the script initialization function - and I had been relying on that default behavior. Thankfully, this was easily solved by passing in my own bundle path:

call vundle#rc(s:editor_root . '/bundle')
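Putting the pieces together, the relevant portion of my vimrc now looks roughly like this (the single rtp line stands in for the handful of similar lines in the real file):

```vim
" Pick the runtime root based on which editor is running
if has('nvim')
    let s:editor_root=expand("~/.config/nvim")
else
    let s:editor_root=expand("~/.vim")
endif

" All runtimepath and Vundle references hang off of that root
let &rtp = &rtp . ',' . s:editor_root . '/bundle/vundle/'
call vundle#rc(s:editor_root . '/bundle')
```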

And with that, my Vim and NeoVim instances were fully isolated.

Switching to NeoVim (Part 1)

2016-11-03 Update: Now using the XDG-compliant configuration location.

NeoVim is all the rage these days, and I can't help but be similarly enthused. Unlike other editors, which have varying degrees of crappiness with their Vim emulation, NeoVim is Vim.

If it's Vim, why bother switching? Much like all squares are rectangles, but not all rectangles are squares, NeoVim has a different set of aspirations and features. While vanilla Vim has the (noble and important) goal of supporting all possible platforms, that legacy has seemingly held it back from eliminating warts and adding new features. That's both a good thing and a bad thing. Good because it's stable, bad because it can lead to stagnation. A Vim contributor, annoyed with how the project was seemingly hamstrung by this legacy (with its accompanying byzantine code structure, project structure, and conventions), decided to take matters into his own hands and fork the editor.

The name of the fork? NeoVim.

It brings quite a lot to the table, and deserves a blog post or two in its own right. I'll leave the diffing as an exercise to the reader. I plan on writing about some of those differences as I do more with the fork's unique features.

So, what did I need to do to switch to NeoVim? I installed it. On Kubuntu, all I needed to do was add a PPA and install the neovim package (and its Python bindings for full plugin support).

$ sudo add-apt-repository -y ppa:neovim-ppa/unstable
$ sudo apt-get update && sudo apt-get install -y neovim
$ pip install --user neovim

Next up, configuration - one of Vim's great strengths. I dutifully keep a copy of my vimrc file on GitHub, and deploy it to any workstation I use for prolonged periods of time. It'd be nice if I could carry it over to NeoVim.

Surprise! It Just Works™. Remember, NeoVim is Vim; as such, it shares the same configuration syntax. Since I don't think I'm doing anything too crazy in my vimrc, it should be a drop-in operation.

$ mkdir -p ~/.config/nvim
$ ln -s ~/.vimrc ~/.config/nvim/init.vim

After that, it's a simple matter of invoking nvim from the command line. Everything loaded and worked for me from the first run!


This pleasant detour over, I went to resume lolologist development. However, when I activated my virtual environment and fired up nvim, I got a message stating:

No neovim module found for Python 2.7.8. Try installing it with 'pip install neovim' or see ':help nvim-python'.

Hm. That's strange. The relevant help docs, however, tell us all we need to know - the Python plugin needs to be discoverable in our path, and, since I'm using a virtual environment, a different Python instance is being used. This is easily addressed, as detailed in that help doc. However, since I use this vimrc file on two platforms (Linux & OS X), I need to be a little smarter about hardcoding paths to Python executables. I added this to my vimrc (it shouldn't negatively impact my Vim use, so it's fine to be in a shared configuration).

if has("unix")
  let s:uname = system("uname")
  let g:python_host_prog='/usr/bin/python'
  if s:uname == "Darwin\n"
    let g:python_host_prog='/usr/local/bin/python' # found via `which python`

Restarting NeoVim with that configuration block in place let it find my system Python and all associated plugins.

I'll keep this site updated with any new discoveries and NeoVim experiments! I'm quite eager to see how the client-server integrations flesh out.

I've written more on this! Part 2.

Accessing Webcams with Python

So, I've been working on a tool that turns your commit messages into image macros, named Lolologist. This was a great learning exercise because it gave me insight into things I haven't encountered before - namely:

  1. Packaging Python modules
  2. Hooking into Git events
  3. Using PIL (through Pillow) to manipulate images and text
  4. Accessing a webcam through Python on *nix-like platforms

I might write about the first three at a later point, but the last was the most interesting to me as someone who enjoys finding weird solutions to nontrivial problems.

Perusing the internet for "python webcam linux" turns up two third-party tools: Pygame and OpenCV. Great! The only problem is that they weigh in at 10MB and 92MB respectively. Wanting to keep the package light and free of unnecessary dependencies, I set out to find a simpler solution...

Read more…

Disabling caching in Flask

It was your run of the mill work day: I was trying to close out stories from a jam-packed sprint, while QA was doing their best to break things. Our test server was temporarily out of commission, so we had to run a development server in multithreaded mode.

One bug came across my inbox, flagging a feature that I had developed: a dropdown that displayed a list of saved user content wasn't updating. I attempted to replicate the issue on my machine - no luck. I fired up my IE VM to test and, yet again, no luck. Weird.

I walked over to the tester's desk and asked her to walk me through the bug. Sure enough, upon saving an item, the contents of the dropdown did not update. Stumped, I returned to my machine and spoofed her account, wondering if there was an issue affecting her account within our mocked database backend (read: flat file). I was able to see the "missing" saved content. Now I was getting somewhere.

I walked back to her machine, opened the F12 Developer Tools, and watched the network traffic. The GET for that dynamically-populated list was returning a 304 status, and IE was using a cached version of the endpoint. I facepalmed, of course - caching isn't something I've normally had to deal with, since I usually use a cachebuster. However, for this new application, my team has been trying to do things as cleanly as possible. Wanting to keep our hack count down, I went back to my machine to see what our web framework could do.

Coming from a four-year stint in ASP.NET land, I was used to being able to set caching per endpoint via the [OutputCache(NoStore = true, Duration = 0, VaryByParam = "None")] ActionAttribute. Sadly, there didn't appear to be something similar in Flask. Thankfully, it turned out to be fairly trivial to write one.

I found this post from Flask creator Armin Ronacher that set me on the right track, but the snippet provided didn't work for all the Internet Explorer versions we were targeting. However, I was able to whip together a decorator that did:
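A minimal sketch of such a decorator (this version is framework-agnostic for illustration: it assumes the wrapped view returns a response object with a dict-like headers attribute, as Flask's Response provides; for views that return plain strings, wrap the return value with flask.make_response first):

```python
from datetime import datetime, timezone
from functools import wraps

def nocache(view):
    """Stamp anti-caching headers onto a view's response.

    Sketch only: assumes the wrapped view returns a response object
    exposing a dict-like `headers` attribute.
    """
    @wraps(view)
    def no_cache_view(*args, **kwargs):
        response = view(*args, **kwargs)
        # A fresh Last-Modified plus the belt-and-suspenders cache headers
        # convinces even older IE versions to skip its cache.
        response.headers['Last-Modified'] = datetime.now(timezone.utc).strftime('%a, %d %b %Y %H:%M:%S GMT')
        response.headers['Cache-Control'] = 'no-store, no-cache, must-revalidate, max-age=0'
        response.headers['Pragma'] = 'no-cache'
        response.headers['Expires'] = '-1'
        return response
    return no_cache_view
```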

To invoke it, all you need to do is import the function and apply it to your endpoint.

from nocache import nocache

@nocache
def my_endpoint():
    return render_template(...)

I took this back to QA and was met with success. The downside of this manual implementation, however, is one needs to be religious in applying it. To this day we still stumble upon cases where a developer forgot to add the decorator to a JSON endpoint. Thankfully, code review processes are perfect for catching that sort of omission.

Multithreading your Flask dev server to safety

Flask is awesome. It's lightweight enough to disappear, but extensible enough to be able to get some niceties such as auth and ACLs without much effort. On top of that, the Werkzeug debugger is pretty handy (not as nice as wdb's, though).

Things were going swimmingly until our QA server went down. While the server may have stopped, development didn't, and we needed a way to get testable builds up and running for QA. One of my fellow developers quickly stood up an instance of our application on a lightly-used box that was nearing the end of its usefulness. To get around the fact that the machine wasn't outfitted for httpd action, the developer just backgrounded an instance of the Flask development server and established an SSH tunnel for QA to use. This was deemed acceptable by our QA team of one. We rejoiced and went back to work.

Days passed and our system engineer's backlog remained dangerously full, with the QA server remaining a low priority. Thankfully, aside from having to restart the process a few times, the Little Server That Could kept on chugging. But then disaster struck - we hired another tester! Technically her joining was a great thing; when you bring in fresh blood you get not only another person working the trenches but also an influx of fresh ideas. However, her arrival brought some unforeseen trouble - the QA server would get more than one user! This wouldn't normally be an issue, but at the moment our QA server was nothing more than a single threaded developer script running as an unmanaged process. Oops.

Sure enough the complaints from QA started bubbling forth: "Test is down!", "Test isn't loading!", "Test is unusable!" To preserve harmony in the office, and to hold to our end of the developer-tester bargain, we had to do something (the sysengineer was busy rebuilding a dead Solr cluster). We needed to buy some time, so I started investigating what we could do with what we already had - maybe there was a way to handle the load with the development server.

Flask uses the Werkzeug WSGI library to manage its server, so that would be a good place to start. Sure enough, the docs state that the WSGIref wrapper accepts a parameter specifying whether or not it should use a thread per request. Now, how to invoke this from within Flask?

This, much like with everything else in the framework, was surprisingly easy. Flask's app.run function accepts an option collection which it passes on to Werkzeug's run_simple. I updated our dev server startup script...

app.run(host="", port=8080, threaded=True)

... and then threw it back over the wall to QA. I moseyed on over to test-land shortly thereafter to discover them chipping away at a backlog of tasks, with the webserver serving away as if nothing happened.
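For completeness, here's roughly what that dev-server entry point amounts to (the app, route, and port below are illustrative stand-ins, not our actual code):

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "hello from a threaded dev server"

def main():
    # threaded=True tells Werkzeug to serve each request on its own thread,
    # so one slow request no longer blocks everyone else.
    app.run(host="0.0.0.0", port=8080, threaded=True)
```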

A few weeks later, the new test server is almost ready for prime time, and our multithreaded Flask server (which should never be used for any sort of production purposes) is still holding down the fort.

Running wdb alongside your development server

We've been using the wonderful wdb WSGI middleware tool at work to aid in debugging our Flask app. The features it brings to the table have helped immensely with development, and it has quickly established itself as an integral part of my toolbox.

One minor issue I had with it, though, was the fact that one needs to manually start the wdb.server.py script before launching a development server. If only I could start and stop wdb whenever Flask did. Some Google-fu introduced me to bash's signal trapping features. After some experimentation, I devised the following wrapper script.
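Something along these lines (a sketch; the wdb server command and the app invocation in the usage comment are assumptions that depend on your setup):

```shell
#!/bin/bash
# run_with_wdb: launch a helper process in the background, run the main
# command in the foreground, and guarantee the helper is killed when the
# main command exits or the script is interrupted.
run_with_wdb() {
    $1 &                       # start the background helper (e.g. wdb.server.py)
    local helper_pid=$!
    trap 'kill "$helper_pid" 2>/dev/null' EXIT INT TERM
    shift
    "$@"                       # run the dev server in the foreground
    local status=$?
    kill "$helper_pid" 2>/dev/null
    trap - EXIT INT TERM
    return $status
}

# Usage (assumed commands):
#   run_with_wdb wdb.server.py python app.py
```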

This has been working quite reliably for us. Here's hoping it helps you!

Quick and dirty parameter passing in Angular 2; The Angling.

This is a followup to an earlier post.

After pushing that hack out to the rest of my team, I felt some modicum of pride - I had, through understanding little-used parts (at least, publicly) of a poorly-documented framework, managed to reason out a nice solution.

Days passed and the codebase expanded. One day a teammate approached me and said, "Hey, you know that trick you used to pass the ID into the controller?" "Yes," I replied cautiously. "You didn't need it."

Insert "Duck typing" joke here.


As a refresher, here's what I had ended up implementing:

<div data-ng-controller="ParentController">
    <div data-ng-repeat="childGuid in childrenGuids" data-ng-controller="ChildController" data-id="{{childGuid}}"/>

... which had some companion JavaScript in the repeated controller that read the value of data-id when the view was updated. There was no other way to pass data in to the ChildController, right? Wrong. Let's step back for a second.

Scope in Angular is special. While the framework goes out of its way to make it seem like things are tightly bound to the encompassing controller and that the DOM merely exists for presentation purposes, nothing (especially in web development) is ever that cut and dry.

Quoth the Angular documentation:

Scopes are arranged in hierarchical structure which mimic the DOM structure of the application.


The Model is scope properties; scopes are attached to DOM where scope properties are accessed through bindings.

As my coworker reminded me, the scope referenced within the ChildController isn't bound to the controller, it's bound to the DOM. And, in the case of the repeater, the scope on the repeater node includes the childGuid property.

What the code looks like now...

<div data-ng-controller="ParentController">
    <div data-ng-repeat="childGuid in childrenGuids" data-ng-controller="ChildController"/>
var ChildController = function($scope) {
    $scope.id = $scope.childGuid;

    var init = function() {
        // do initialization
    };
};

As you can see, on the second line of the controller, I'm directly referencing the repeated element's scope property. You don't need to re-assign it as I do in this example, by the way; referencing $scope.childGuid directly won't present any issues.

An important caveat: if you think the controller might get used elsewhere, it would be beneficial to guard against instantiation without the required arguments...

var ChildController = function($scope, $exceptionHandler) {
    $scope.id = '';

    var init = function() {
        if (typeof $scope.childGuid === "undefined") {
            $exceptionHandler("The ChildController must be initialized with a childGuid in scope");
            return;
        }
        $scope.id = $scope.childGuid;
        // continue initialization
    };
};


And with that my hack was dead. This was a good thing as hacks are, by nature, smells that should be minimized.

The moral of the story? Write self-congratulatory blog posts about slightly-hacky solutions. Someone will submit a patch.

Capable Chromecast Configuration (or, how I got Chromecast to work with Ubuntu)

So, I now own a Chromecast. Having acquired it while the Netflix promotion was in effect, I only ended up spending $11 on it. Not bad for a YouTube machine (the only reason I bought it, since Roku still doesn't have YouTube support). I've been pretty happy with it thus far, as it turned out to be a great party tool, and the HDMI CEC support makes it really easy to throw YouTube videos on in the background, even if the TV is off.

However, one feature I was having trouble getting to work under Linux was tab casting. Using the Chromecast Chrome extension, one should be able to transmit the contents of a tab to a Chromecast on the local network. For some reason, though, I was unable to do so. After some poking around, I determined that my firewall (ufw, the greatest misnomer of them all) was the culprit. Since the extension was using a randomly assigned local port to establish the connection, simply opening a port wasn't the answer. Not to be deterred, I set about conquering this problem. Here's what I did.

  1. Find the Chromecast's MAC address. In my case this was a two-step operation...

    1. Launch the Chromecast app to determine the device's IP address.

      A screenshot of the IP address of a Chromecast displayed within the Chromecast app.

      The IP address is circled in red.

    2. Ping the device and then check the address resolution stats.

      [aru@Ananke:~]$ ping -c 1
      PING ( 56(84) bytes of data.
      64 bytes from icmp_req=1 ttl=64 time=1.66 ms
      --- ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 1.669/1.669/1.669/0.000 ms
      [aru@Ananke:~]$ arp -a
      ? ( at d0:e7:82:7c:15:76 [ether] on wlan0
  2. Assign a static IP to the device. Configure your router to assign a static IP to the MAC address you found in the previous step. If you don't know how to do this, Port-Forward.com has some decent tutorials. You may have to restart your router after setting up this rule.

  3. Determine your local port range. In my case, since I'm using IPv4, I did the following:

    [aru@Ananke:~]$ cat /proc/sys/net/ipv4/ip_local_port_range
    32768   61000

    The two numbers are the lower and upper bounds of the local ports available for casting.

  4. Whitelist your Chromecast. Using the now-static IP address and your local port ranges, tell ufw to step off!

    [aru@Ananke:~]$ sudo ufw allow proto udp from to any port 32768:61000
    Rule added

Once you've done this, open up Chrome, click on the Chromecast button, and start tabcasting!

Chromecast extension screenshot.

As you can see, I don't use Chrome for very much.

Quick and dirty parameter passing in Angular

Don't use this code; instead, read this post and then read the follow-up.

Since my last post I've switched teams at work after completing a massive redesign of all of my old team's webapps. The overhaul saw the unification of the look and feel of the product's core webapps, with a significant amount of instrumentation and modularity to enable the less aesthetically-inclined people within the team to contribute without reducing the overall coherence of the application.

It just so happens that I joined my new team just as they were planning a massive redesign of their frontend! After assessing our users' needs and desires, we set out to redesign our stack. I'm not going to go into the full details of the transition in this post, but we settled on a Python stack running AngularJS, Flask, uWSGI and nginx (a huge improvement, development-wise, from our current JQueryUI, TurboGears, mod_wsgi and Apache system).

Prior to this redesign effort, my only experience with client-side MVC frameworks was with KnockoutJS, a great JavaScript library for managing databinding. However, when we scaled it up to include things like validation, complex objects, RESTful behavior against remote resources, and DOM manipulation, it really started to show its seams. Part of what attracted me to Angular was its similar approach to data binding, but with a stricter separation of concerns between the controller and view layers.

I could continue to wax poetic about our decision, but I'll save that for another post. I just wanted to provide some backstory for this post.

Just the other day I found myself writing a controller that instantiates new child controllers via a repeater...

<div data-ng-controller="ParentController">
    <div data-ng-repeat="childGuid in childrenGuids" data-ng-controller="ChildController" />

This worked fine. Pushing a new item to ParentController's $scope.childrenGuids array would generate a new ChildController. However, I soon discovered that what I was attempting to engineer would require each ChildController to have a unique ID, generated by the ParentController prior to the child's instantiation. I attempted to use templating to my advantage in various ways, for example:

<div data-ng-controller="ParentController">
    <div data-ng-repeat="childGuid in childrenGuids" data-ng-controller="ChildController">
        <input type="hidden" value="{{childGuid}}" />

No dice. Why not try broadcasting from the parent scope after I push something to the stack? I feared race conditions. Why not a directive? It felt like too much code for something that should be simple. After reading the relevant portions of the Angular docs several times over, I threw together a nice little hack that got the job done.

<div data-ng-controller="ParentController">
    <div data-ng-repeat="childGuid in childrenGuids" data-ng-controller="ChildController" data-id="{{childGuid}}"/>
var ParentController = function($scope) {
    $scope.childrenGuids = [];

    $scope.addChild = function() {
        var guid = newGuid();
        $scope.childrenGuids.push(guid);
    };
};

var ChildController = function($scope, $attrs) {
    $scope.id = '';

    var init = function() {
        //finish initialization
    };

    $attrs.$observe('id', function(value) {
        if (!$scope.id) { //defensive sanity check
            $scope.id = value;
            init();
        }
    });
};

Essentially, what I'm doing is passing the ID in via the controller declaration. However, because of the order in which Angular digests the markup, interpolation of templated attributes happens after the construction of the controller. Because of that, I had to instruct the framework to do some extra processing when the interpolation event fired and the data-id attribute was updated. After that the controller is free to finish its initialization phase.

Happy with the outcome, I pushed my changes out to my team and made a note to revisit this block when I get around to refactoring this section (it'll happen, don't worry).

EDIT: It happened.