As we take the lid off Haiku Deck Zuru for more users to experience, we know some will be curious about the inner workings of the app. If you're interested in what's going on inside the machine, here's a quick overview.
Step 1: Parsing and Data Structuring
Types of artificial intelligence used: natural language processing
The presentation transformation process begins with an uploaded PowerPoint or Keynote file, or with an outline or Wikipedia article.
First, Haiku Deck Zuru’s parsing engine extracts the text, converting it from a binary file into structured data that can be analyzed (and, ultimately, transformed).
In the case of presentations and Wikipedia articles, Zuru looks for and extracts any custom images, analyzing where the images are placed on each slide and assessing importance by calculating how much of the slide area each image takes up. If an important custom image is identified, Zuru centers and aligns it to fit the visual style of Haiku Deck.
At this stage, Zuru also uses natural language processing to strip words down to their most meaningful roots, remove duplicates, standardize text, and identify meaningful compound words (for example, “Space Needle” instead of “space” and “needle”). This processing stage is critical for identifying appropriate images later on.
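To make that normalization pass concrete, here's a minimal Python sketch. Zuru's actual NLP pipeline isn't public, so the stopword list, the compound-phrase list, and the crude suffix-stripping stemmer below are toy stand-ins for what a real pipeline would use:

```python
import re

# Tiny illustrative resources; a production pipeline would use much larger ones.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}
COMPOUNDS = {("space", "needle"): "space needle"}  # hypothetical compound list

def stem(word):
    # Crude suffix stripping, standing in for a real stemmer.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def normalize(text):
    """Lowercase, tokenize, merge known compounds, stem, drop stopwords, dedupe."""
    tokens = re.findall(r"[a-z]+", text.lower())
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i : i + 2])
        if pair in COMPOUNDS:  # keep "space needle" as one unit
            merged.append(COMPOUNDS[pair])
            i += 2
        else:
            merged.append(stem(tokens[i]))
            i += 1
    seen, out = set(), []
    for tok in merged:
        if tok not in STOPWORDS and tok not in seen:
            seen.add(tok)
            out.append(tok)
    return out

print(normalize("Visiting the Space Needle and walking dogs"))
# ['visit', 'space needle', 'walk', 'dog']
```

Note how "Space Needle" survives as a single phrase while "walking" and "dogs" are stripped to their roots — exactly the shape of output the image search needs later on.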
Next, Zuru looks at how the text is laid out, noting patterns such as list items, titles and subtitles, headers and subheaders, and so on. It analyzes the size and placement of text to intelligently identify footer text and to zero in on which text is important and which is not.
Zuru also analyzes the text to evaluate at a high level how well it follows presentation best practices, and to identify presentations that will require more manual intervention. For example, are there too many bullets, or too many words in general?
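That best-practice check doesn't require anything fancy. Here's a minimal sketch — the thresholds are invented for illustration, since Zuru's real rules aren't public:

```python
# Hypothetical thresholds; Zuru's actual best-practice rules aren't public.
MAX_BULLETS = 6
MAX_WORDS = 40

def needs_manual_review(slide):
    """Flag slides that stray too far from presentation best practices."""
    words = len(slide["title"].split()) + sum(len(b.split()) for b in slide["bullets"])
    return len(slide["bullets"]) > MAX_BULLETS or words > MAX_WORDS

crowded = {"title": "Quarterly results", "bullets": ["Revenue up 4% over plan"] * 8}
print(needs_manual_review(crowded))  # True -- too many bullets
```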
Step 2: Keyword Analysis
Types of artificial intelligence used: machine learning
This is where the artificial intelligence behind Zuru gets really interesting, as Zuru uses the anonymized data it has gathered from the millions of presentations created by Haiku Deck users to suggest a beautiful, relevant image for each slide.
Using this giant data set (which had to be crunched for 36 hours using a linear regression model), Zuru looks at how frequently each possible keyword on a slide appears in presentations, vs. how frequently it has been used as an image search term and actually selected as an image result.
- “Love” appears very frequently in presentations, but it’s not frequently used as the image search term. So when Zuru sees “love” on a slide, it is less likely to select that word for the image search.
- “Dog” appears frequently, and is frequently chosen. So when Zuru sees “dog” on a slide, it is more likely to select that word for the image search.
- “Space Needle” does not appear frequently, but is frequently chosen. So when Zuru sees “Space Needle” on a slide, it is very likely to select that phrase for the image search (and, if you remember from step 1, it’s smart enough to look for the iconic structure instead of pictures of stars or sewing).
Zuru also takes into consideration where a given keyword appears in the slide (for example, a word appearing in the header is likely more important than a word appearing in the 5th bullet), how frequently it appears on the slide, and how frequently a given word has appeared in other slides. All these calculations help determine the best term or terms to use for an image search.
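Put together, the ranking boils down to comparing how often a keyword appears with how often it actually gets chosen, weighted by where it sits on the slide. Here's a rough sketch — the statistics and position weights are invented, since Zuru's real data set isn't public:

```python
# Hypothetical aggregate stats: (times appeared in slides, times chosen for image search).
KEYWORD_STATS = {
    "love": (100_000, 2_000),
    "dog": (50_000, 20_000),
    "space needle": (300, 250),
}

# Hypothetical position weights: header text outranks deep bullet points.
POSITION_WEIGHT = {"header": 2.0, "bullet": 1.0, "footer": 0.2}

def score(keyword, position):
    """Selection rate (chosen / appeared), weighted by where the keyword sits."""
    appeared, selected = KEYWORD_STATS.get(keyword, (1, 0))
    return (selected / appeared) * POSITION_WEIGHT.get(position, 1.0)

candidates = [("love", "header"), ("dog", "bullet"), ("space needle", "bullet")]
best = max(candidates, key=lambda c: score(*c))
print(best[0])  # space needle: rarely appears, but almost always chosen
```

Even with "love" in the header, its low selection rate keeps it from winning — which matches the examples above.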
Step 3: Image Selection
Next, Zuru runs an image search for the most highly ranked keywords, returning hundreds of extremely high-quality, Creative Commons-licensed images in a matter of milliseconds. We have carefully designed this step to handle massive numbers of image searches quickly and in parallel, so that it feels instantaneous to the user.
Because of all the natural language processing and normalization we've done already (see step 1), as well as our robust proprietary data set (see step 2), more than 90% of image search terms appear in our list of the 70,000 most popular image search terms, which allows Zuru to suggest standout images incredibly quickly and accurately.
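The parallel fan-out itself is straightforward. Here's a sketch using Python's thread pool, with a placeholder function standing in for the real image-search API:

```python
from concurrent.futures import ThreadPoolExecutor

def search_images(term):
    # Placeholder for a real Creative Commons image-search API call.
    return [f"{term}-result-{i}" for i in range(3)]

def search_all(terms):
    """Fan out one search per keyword so slow queries overlap instead of stacking."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return dict(zip(terms, pool.map(search_images, terms)))

hits = search_all(["dog", "space needle"])
print(hits["dog"][0])  # dog-result-0
```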
Images that have been hand-curated by our team, or that have been selected frequently by Haiku Deck users, rise to the top — and, of course, Zuru will continue to get smarter with every new presentation created.
Step 4: Image Optimization
Types of artificial intelligence used: computer vision, K-means clustering
Once Zuru has identified the perfect image for a slide background, it converts the image into the Lab color space, a format closely mapped to the way humans perceive light. Zuru uses a process called K-means clustering to analyze the color palette, removing grayscale tones and zeroing in on the colors that appear most frequently in the image. It then pulls out the most prominent colors, performs brightness analysis on them, and compares them against hundreds of professionally designed color palettes to select the ideal colors for the slide foreground and background.
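For the curious, here's a bare-bones K-means implementation over color triples. For simplicity it works directly on RGB tuples (the real system works in Lab space, and also filters grayscale), and it assumes the image has at least k distinct colors:

```python
import random

def kmeans_colors(pixels, k=3, iters=10, seed=0):
    """Plain k-means over color triples. Assumes at least k distinct colors."""
    rng = random.Random(seed)
    centers = rng.sample(sorted(set(pixels)), k)  # seed with k distinct colors
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pixels:
            # Assign each pixel to the nearest center (squared Euclidean distance).
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # recenter on the cluster mean
                centers[i] = tuple(sum(ch) / len(members) for ch in zip(*members))
    # Rank centers by cluster size: the biggest clusters are the dominant colors.
    ranked = sorted(zip(clusters, centers), key=lambda cc: len(cc[0]), reverse=True)
    return [center for _, center in ranked]

# A mostly-blue image with some white and a little red:
pixels = [(20, 40, 200)] * 60 + [(250, 250, 250)] * 30 + [(200, 30, 30)] * 10
print(kmeans_colors(pixels)[0])  # (20.0, 40.0, 200.0) -- blue dominates
```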
Next, Zuru performs brightness analysis to determine the best placement of the text (top, middle, or bottom), and whether a text screen is needed for optimal readability and contrast.
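A simplified version of that placement decision: average the brightness of each third of the image and pick the one farthest from mid-gray, since a clearly dark or clearly light region gives the best contrast for text. The luma formula is the standard Rec. 601 approximation; the screen threshold is invented for illustration:

```python
def luminance(rgb):
    # Standard Rec. 601 luma approximation.
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def place_text(rows):
    """rows: a list of pixel rows (top to bottom). Returns (placement, needs_screen)."""
    n = len(rows)
    thirds = {
        "top": rows[: n // 3],
        "middle": rows[n // 3 : 2 * n // 3],
        "bottom": rows[2 * n // 3 :],
    }
    avg = {
        name: sum(luminance(p) for row in part for p in row) / sum(len(row) for row in part)
        for name, part in thirds.items()
    }
    # Farther from mid-gray means stronger contrast with plain white or black text.
    best = max(avg, key=lambda name: abs(avg[name] - 127.5))
    needs_screen = abs(avg[best] - 127.5) < 60  # invented threshold
    return best, needs_screen

# Dark top, mid-gray middle, bright bottom: text goes on the bottom, no screen needed.
rows = [[(10, 10, 10)] * 4, [(128, 128, 128)] * 4, [(255, 255, 255)] * 4]
print(place_text(rows))  # ('bottom', False)
```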
Zuru will also look for faces in your background image and try to position your text so that it's not covering a subject's face.
Step 5: Finishing Touches
Our goal with Zuru is to use data and artificial intelligence to get presentations 90% of the way there in a matter of minutes, and to make it very easy for the presentation creator to review and fine-tune the results.
In this final stage, paid subscribers get the opportunity to edit their presentation in Haiku Deck or download it for editing offline in a traditional presentation app.
Sound complicated? Well, it is. The good news is that although Zuru is complex behind the scenes, it’s incredibly important to all of us that the user experience is simple. We hope you agree, but let us know how we can improve it for you!