@bsmithgall
Last active August 29, 2015 14:19
Open Vis Notes

Welcome

Community Notes

Tweeted URLs

Keynote

Speaker: Jeffrey Heer (Twitter)

Raising the Bar (Chart): The Next Generation of Visualization Tools

Visualization Tools

  • Visualization tools
    • Prefuse, Protovis, Vega, D3
  • "Raising the bar" -- how do we not replace but improve on current tooling
  • "How might we create visualizations?"
    • Spreadsheets --> templates (easy, low expressivity)
    • High level visualization code (ggplot dsl, for example)
    • Encoding in javascript/d3 (high barrier to entry, expressivity)
  • Platforms for visualizations
    • Charting tools
    • Visual analysis grammars
    • Visualization grammars (expressive system with a great degree of design controls, but still allows you to get the job done faster)
    • Component Architectures
    • Graphics APIs
  • How do you design a "declarative language"
    • Programming by describing what you want as opposed to how you want to do it (decouple specification from execution)
    • Vs (for example) imperative programming, where you give explicit steps
    • HTML/CSS and SQL are both examples of declarative languages (also DSLs!)
  • Why declarative languages
    • Faster iteration, less code, larger user base
    • Better visualizations -- when generating vis for a larger audience, the tool can supply smarter defaults
    • Write once and re-apply by extracting out low-level details
    • Increased performance when you decouple specification from execution
    • Write a visualization once, have it run with different renderers & inputs
    • Programmatic generation -- write programs that output visualizations; automated search and recommendation
  • Vega -- a visualization grammar
    • Data -- input data to visualization
    • Transformations -- Grouping, stats, projection, layout
    • Scales -- map data values to visual values
    • Guides -- Axes and legends visualize scales
    • Marks - Data-representative graphics
      • Graphical primitives (a lexicon of different pieces)
    • Allows organization and reasoning about the structure of the visualization
    • Vega-Lite
      • A formal model for statistical graphics; includes data transformation and encoding; compiles to a Vega spec
    • Polestar -- an interface for interacting with Vega-Lite
    • DOES THIS TRANSPILE TO D3?! Yes!
    • Because they have the same ecosystem, can we run down the toolchain
    • Voyager -- reduce tedious manual specification
      • Support early-stage data exploration
      • Encourage data coverage
      • Discourage premature fixation
    • Browse a gallery of visualizations -- a challenge is the combinatorial explosion of views!
  • It's not that one tool should rule them all, but that the ecosystem should support tools working together
  • What about interaction?
    • Treat user input as first-class streaming data -- it's another source of data that streams in, so you need a technique of processing this streaming source
    • Vega 2.0 -- single declarative model for specifying visual encoding + interaction techniques
  • Open Challenges
    • Designing interactions interactively?
    • Enhancing the "gallery" experience -- overwhelmed by too many graphics; how do we make it easier to read multiple charts. What about embedding small views in large spaces?
    • Improving visualization recommenders -- how do we learn from users over time? Are they different for different data domains?
    • Debugging -- as you go up levels of abstraction, they leak when something doesn't work the way it should. How can we use visualizations of the actual specification to do debugging?
  • Questions -- what about fishing?
    • People do this regularly in all existing visualization tools
    • How can a tool support handling this sort of thing?
    • How does designing a recommendation system help with this? The bigger problem may be early fixation
  • Does this output d3?
    • Theoretically, this is possible. This has a lot to do with how data transformation is handled.
    • It's really difficult to do this because you end up with mostly custom data flows
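
The split between specification and execution described above can be sketched in a few lines. This is an illustration in Python, not Vega itself: the spec layout and the names `linear_scale` and `render_bars` are made up for the example. The spec says what to draw (data, a scale, a mark type), and a separate renderer decides how.

```python
def linear_scale(domain, range_):
    """Return a function mapping a data value in `domain` to a visual value in `range_`."""
    d0, d1 = domain
    r0, r1 = range_

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale


# Hypothetical declarative spec: describes WHAT to draw, not how to draw it.
spec = {
    "data": [
        {"name": "a", "value": 10},
        {"name": "b", "value": 25},
        {"name": "c", "value": 50},
    ],
    "scales": {"width": {"domain": (0, 50), "range": (0, 40)}},
    "mark": "bar",
}


def render_bars(spec):
    """One possible renderer: interpret the spec as rows of text bars."""
    s = spec["scales"]["width"]
    scale = linear_scale(s["domain"], s["range"])
    return [
        f"{row['name']} " + "#" * round(scale(row["value"]))
        for row in spec["data"]
    ]


lines = render_bars(spec)
for line in lines:
    print(line)
```

Because the spec is plain data, a different renderer (SVG, canvas, a test harness) could consume the same spec unchanged, which is the "write once, run with different renderers" point from the talk.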

Learning while making p5.js

Speaker: Lauren McCarthy (Twitter)

hello.p5js.org

Notes

  • p5js
    • Wow p5 is really awesome! This is super amazing. Let's play with it; maybe try to do an ignite talk based on it.
      • Audio nose capture ellipse thing was amazing!
      • This is super neat.
    • p5 contributors conference at the end of May.

So You Think You Can Scroll

Speaker: Jim Vallandingham (Twitter, Slides)

Demo

Notes

  • "Scrolling is a continuation; clicking is a decision"
  • Scrolling dimension
    • Use scrolling to convey distance
  • Scrolling triggers
  • Scrolling steps
    • Scrolling used to direct a story
  • Scrolling continuous
  • Interactive storytelling
  • Scroll events
    • getBoundingClientRect() -- gives the position and size of an element relative to the viewport
    • window.pageYOffset -- gives the current vertical scroll offset of the page
  • Mike Bostock has a "how to scroll" article -- don't scrolljack
  • Jesper Kiledal - Is Up Really Up? There are two mental models:
    • Movement of viewport on a fixed document
    • Moving text model, where the window is fixed but the document moves up and down
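
A rough sketch of the "scrolling steps" idea: given the current scroll offset and the top position of each section, work out which step of the story is active. It is written in Python as a language-neutral illustration; in the browser the offsets would come from something like `getBoundingClientRect().top + window.pageYOffset`. The function name and the pixel values are assumptions for the example.

```python
from bisect import bisect_right


def active_step(scroll_y, section_tops):
    """Index of the last section whose top is at or above the scroll offset.

    `section_tops` must be sorted ascending. Returns -1 if the reader has not
    reached the first section yet.
    """
    return bisect_right(section_tops, scroll_y) - 1


# Hypothetical page layout: top of each story section, in pixels.
section_tops = [0, 600, 1400, 2100]

for y in (0, 599, 600, 1500, 5000):
    print(y, "->", active_step(y, section_tops))
```

A scroll handler would call this on each scroll event and only update the visualization when the returned index changes.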

A bit about touch

Speaker: Ramik Sadana

Notes

  • Three scales for touch: mobile (7"), tablet (10"), large touchscreens (>10")
    • All have very different contexts
  • Focus on challenge of tablets
    • Design in the digital age should have three properties: usefulness, usability, desirability
    • Usefulness: A product's clarity of its own content and purpose (Who, Why, When, Where, How)
    • Usability:
      • Guessability
      • Learnability
        • Complexity has many components including duration, distance, fingers, taps, etc.
        • Tap & pan -- people didn't like it because it is both not guessable and hard to learn
        • How can you discover things? In information visualization, this is actually not that big of a deal
        • Expected action - the user expects that some gestures will produce a certain response
      • Affordance: cues for guiding interaction
    • Desirability: you must want to use the product to make it worthwhile

That's The Power of Loops

Speaker: Lena Groeger

Slides

Notes

  • Gifs!
  • Modern GIF usage:
    • Reaction gifs
    • Mashup gifs, which combine
    • Breaking news gifs
  • The history of looped images (gifs in the past)
    • Early 1800s, there were devices that created the illusion of motion
  • The power of the gif is that it is repeated in time over and over, forever
  • Infinite looping makes gifs a powerful tool
  • Why are gifs useful?
    • Loops that exist already today
    • Loops that explain a process
      • Here's how a cheetah runs, here's how a lock works, here's how a sewing machine works
    • Loops that explain a concept
      • CSS positioning, sorting algorithms
    • Show probability and chance
  • What happens when repetition is looped?
    • Exposure - we like things that we've been exposed to before
    • Shifting attention - first we listen to the melody, then maybe drums/lyrics/guitar
    • Memorization -
  • Imagine if we had a gif to perform, say, CPR as opposed to a list of instructions
    • Order, steps, and visual information are all encoded in the loop
  • Transformation
    • Let's mash up "sometimes behave so strangely w/p5.js"!
    • Words -> music by repeating them in a loop

WebGL for Graphics and Data Visualization

Speaker: Nicolas Garcia Belmonte

Notes

  • What is WebGL?
    • API to access the GPU from the web
    • What can WebGL be used for?
      • Real-time, exploration, storytelling, scientific, data art/illustration
    • Computation in JS is the bottleneck for WebGL, as opposed to rendering
    • You could use WebGL to do real-time analysis on top of sound, for example
  • How does WebGL work?
    • Store an array of floats into a buffer which can be accessed by the GPU and then rendered
    • You can also write a vertex shader/fragment shader on a rendering pipeline
      • Vertex shader is used to rotate/scale points (1:1)
      • Fragment shader decides which color each pixel in the screen should be
    • WebGL pipeline
      • Vertex shader -> Triangle assembly -> Rasterization -> Fragment Shader
    • GLSL - A DSL for graphics
      • A C-like language with built-in types and functions for graphics
        • Vectors, matrices, math functions, reflect, refract (functions for graphics)
      • Operator overloading
        • You can do operations between different types and it will work as you expect
  • Hopf Fibration
    • A way to explore 4-dimensional shapes. You try to project them into 3D
    • The Hopf map relates an ordinary sphere in 3D to a sphere in 4D: each point on the 3D sphere corresponds to a circle on the 4D sphere. This is a nice way to explore a 4D shape: imagine a point on a sphere, find its circle in 4D, then project that circle back to 3D. A stereographic projection preserves circles, so projecting 4D --> 3D stereographically gives you a circle in 3D.
    • In practice
    • Data Model
      • Ideally, we would want an array of [lat, lng] --> [lat1, lng1, lat2, lng2, ...]
      • You want to make up a circle, though, so you need to send over a third argument, the position along the circle (from 0 to 2π)
      • Send an array from 0 --> 2π, it comes back as a circle
    • Interactions
      • This is a pain point in WebGL
      • WebGL is rendering to a bunch of pixels, so you don't actually know anything about the third dimension
      • What we do is render over a sphere that has a texture of one different color per pixel; every color will then map uniquely to a position in a sphere. It is then easy to map from the 3 dimensions of color (R,G,B) back to the lat,lng. This is called colorpicking.
  • Key takeaways
    • Uses the GPU for speed and scale
    • Uses range from data art to exploratory vis
    • The API is very low-level, so three.js or other libraries are a good place to start
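
The colorpicking trick from the interactions notes boils down to a reversible encoding between an object index and an RGB triple. A minimal sketch in Python (no WebGL involved); the 1-degree grid and all function names are assumptions for illustration, not the speaker's implementation.

```python
COLS, ROWS = 360, 180  # hypothetical grid: one cell per degree of lng/lat


def id_to_rgb(index):
    """Pack an integer id into three 8-bit color channels."""
    if not 0 <= index < 2 ** 24:
        raise ValueError("id must fit in 24 bits")
    return (index >> 16) & 0xFF, (index >> 8) & 0xFF, index & 0xFF


def rgb_to_id(r, g, b):
    """Recover the id from the color read back under the mouse."""
    return (r << 16) | (g << 8) | b


def cell_to_latlng(index):
    """Top-left corner (lat, lng) of a grid cell, rows running north to south."""
    row, col = divmod(index, COLS)
    return 90 - row, col - 180


# Round trip: the color painted for a cell identifies that cell again.
cell = 90 * COLS + 180
color = id_to_rgb(cell)
print(color, cell_to_latlng(rgb_to_id(*color)))
```

In practice the picking texture is rendered off-screen with these colors, the pixel under the cursor is read back, and the decoded id is looked up; lighting and antialiasing have to be off for that pass or the colors no longer decode cleanly.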

Mapping the Cosmos: Visualizing Millions of Objects in Space

Speaker: Ian Webster

Notes

  • There are ~2,000 objects that might collide with earth (woohoo!)
  • There are ~750k asteroids in space, and we want to know which asteroids are going to swing by the earth, which are candidates for missions, etc.
  • NASA JPL has a powerful tool, but it assumes a lot of background knowledge. There is also the Minor Planet Center, whose data comes as a space-delimited text file.
  • Asterank
    • Ranked query results for arbitrary questions about small bodies in space (asteroids)
    • Let the public engage with space
    • Gather data
      • Scrape JPL, MPC, others
    • Calculate
      • Composition, mission costs, etc.
      • Delta-v -- the change in velocity (a proxy for how much fuel/effort) that you need to get from point a to point b in space
    • Combine
      • Finally put it somewhere else
      • Use WebGL to create a visualization of these things. Let people go in and view the information in context.
    • "The data doesn't mean anything to people if it's not something you can reach out and play with"
  • Web workers
    • When your JS is executing, you are blocking the UI. This means that you are blocking the visualizations. You can instead create workers to handle processing, and send messages back and forth.
    • Serializing messages is very costly, aim to reduce it. You want to send all the data up front, and only send back results
    • One worker is usually enough. There aren't linear returns for web workers.
    • Web workers cannot touch the DOM, and depending on the browser you may not get things like console.log() either
    • Not supported on all browsers
  • Timed array processing or chunking
    • As an alternative to workers, break up the array into smaller chunks and work until you are either done with your chunk or out of time. Then you set a very short timeout, which gives the browser a chance to do things like reflow before the next chunk
    • Eventually, though, you will probably need to go to the GPU
  • Very tempting to represent each point as a particle, but you should instead defer to a ParticleSystem, which is a single geometry with many shifting vertices
    • Put as much as possible on the GPU.
    • There are a lot of gotchas; make sure to test on all sorts of browsers and operating systems.
      • On Windows, DirectX adds another layer that makes things more difficult.
      • Monitor FPS and gracefully degrade; turn on/off components that can keep the user at constant 60fps.
  • Displaying lots of static data
    • Group particles (R-tree)
    • Reduce everything to a visible subset -- for example, take two galaxies that are close together and merge them into one with comparable luminosity.
    • Adjust particle size and adjust shaders on the go. Checking on FPS, where you are in the universe/visualization, etc.
    • Preserve the visual identity without compromising the performance
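
The timed array processing idea can be sketched as a generator that works through a list until its time budget is spent, then hands control back. This is a Python stand-in for the browser pattern (where the hand-off would be a short `setTimeout`); `process_in_chunks`, the 10 ms budget, and the injectable `clock` are assumptions for the example.

```python
import time


def process_in_chunks(items, handle, budget_s=0.01, clock=time.monotonic):
    """Handle items until the time budget runs out, then yield control.

    Yields the number of items processed so far after each chunk, so the
    caller can let other work (in a browser: layout, painting, input) run.
    """
    i = 0
    while i < len(items):
        deadline = clock() + budget_s
        while i < len(items):
            handle(items[i])
            i += 1
            if clock() >= deadline:
                break  # out of time for this chunk
        yield i


seen = []
for done in process_in_chunks(list(range(1000)), seen.append):
    pass  # in a browser, this is where the short timeout would go

print(len(seen))
```

Chunk size adapts to how expensive each item is, because the loop checks the clock rather than counting items.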

Using humor to inform

Speaker: Nigel Holmes

Notes

  • Context is what matters
  • Seldom does your audience know everything

The Human Side to being a Digital Practitioner

Speaker: Jono Brandel

Notes

  • The advent of software muddles the dichotomy between utility and expression
  • Expression, Utility, and Software
  • More of a fun talk, but Patatap is cool

Blindfolded Cartography

Speaker: Andy Woodruff

Notes

  • Traditional cartography is all about finding a particular story in the data that you have. With modern web tech, you might know about the data, but you certainly don't know the data itself in the same way
  • It might also change in the future if they add/remove something
  • We look for a number of compromises in the area of their experience
    • How do you balance good design with unknowns
    • How do you design well if you don't know what you are designing
  • Data Classification (especially binning data)
    • Data distribution in design
      • We design around a nice normal-ish distribution
      • Outliers cause problems (every time)
    • A classification scheme needs to work for any distribution without intervention
      • Considerations:
        • Will the chart/map be useful and look good
        • Are the class breaks meaningful -- does it have any real-world meaning?
        • Are the breaks understandable -- does the map user get how the numbers were derived?
        • Are the breaks nice numbers (400 vs. 402.3785)?
      • Common methods:
        • Jenks optimal breaks -- maximize similarity within groups and difference between groups
        • Quantile breaks -- same number of items in each bin
        • Compromise: pull high-end into its own class
        • Compromise: do quantiles based on unique values
  • Density
    • Make sure small things are on top of big things
    • Take advantage of known geographies (states zoom to extents, for example)
  • null
    • "No data" is data! Explicitly design for it
    • No data can come in a variety of forms
    • Zero is not null!
    • Texture can be a good way of handling null on a map; grays can get lost depending on map color scheme
    • Time series can also have gaps -- don't try to interpolate when you don't have two points to interpolate between
  • Text
    • If you truncate with ellipses, you have to be mindful of the different properties; Congressional Districts have a number which should probably be shown, for example
    • Mind your number formatting (one vs. multiple, thousand vs. million)!
  • The Lorem Ipsum Map
    • Real/realistic data is not always available to designers/developers
    • Designing UI around Lorem Ipsum can be detrimental to the overall look and feel of the user experience
    • Fake data can also be a test of code, because if your map can work with terrible data, it can work with real data. The worse your fake data, the better
  • How do you deal with "big design"
    • Prioritize areas you care about
    • Prioritize your most common areas
    • Also look at some weird edge cases
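
To make the classification compromise concrete, here is a small sketch of quantile breaks, with the option to compute them over unique values only. It is an illustrative implementation written for these notes (the function names and sample data are made up), not the code from the talk.

```python
from bisect import bisect_left


def quantile_breaks(values, n_classes, unique=False):
    """Upper class limits so each class holds roughly the same number of items."""
    data = sorted(set(values)) if unique else sorted(values)
    if not data:
        raise ValueError("need at least one value")
    breaks = []
    for k in range(1, n_classes + 1):
        # ceil(k * n / n_classes) - 1, clamped to a valid index
        idx = -(-k * len(data) // n_classes) - 1
        breaks.append(data[min(len(data) - 1, max(0, idx))])
    return breaks


def classify(value, breaks):
    """Index of the first class whose upper limit is >= value."""
    return min(bisect_left(breaks, value), len(breaks) - 1)


# Skewed, zero-heavy data with one outlier: the kind that breaks naive schemes.
values = [0, 0, 0, 0, 0, 0, 1, 2, 3, 1000]

plain = quantile_breaks(values, 5)
by_unique = quantile_breaks(values, 5, unique=True)
print(plain)
print(by_unique)
```

With plain quantiles the repeated zeros produce duplicate breaks, so some classes can never be used; computing the breaks on unique values spreads the classes across the values that actually occur.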

Storytelling and Data: Why? How? When?

Speaker: Robert Kosara

Notes

  • We tell stories because we get people's attention
  • We tell stories because of our memories; stickiness/memory is important when you are doing something that is a bit unusual, especially when you are doing communication.
    • The way memory works is not like a computer; we don't just store and retrieve information
    • "You have to remember something to remember something else"
    • We develop more complex language to express more complex stories
  • How can we tell stories, especially about data?
    • Techniques
      • There are techniques that are specifically about storytelling
      • Connected scatterplot -- you can turn a small number of datapoints into a story with good techniques
        • This doesn't work in many cases, however; you get many awful hairballs
        • Works for some things very well, not at all for other things
      • ISOTYPE charts -- use little icons to represent multiples
      • It can be very valuable to do very specific things; we don't necessarily have to make everything as general as possible
    • Structure (low level structure)
      • You remember events over continuation -- so events are still important to keep in mind
      • Cognitive psychology backs up this bias towards events
      • Looking at steps reminds us of comics/comic strips, so how can we emulate comics creation (the sequential art) in graph creation
      • Scott McCloud "Understanding Comics", Neil Cohn "Visual Narrative Structure"
      • Examples:
        • EIPR Model
          • Frame 1: Establisher -- tells you where you are
          • Frame 2: Initial -- start of the action
          • Frame 3: Peak -- Where the main thing happens
          • Frame 4: Release
  • When do you tell stories?
    • Zeigarnik Effect -- we remember things that are unfinished
      • Waiters remember orders for large groups until they are paid for; the story ends then

Keynote

Speaker: Santiago Ortiz

Notes

  • Value of the connection protocol via iframes; you can transmit data back and forth across independent modules
  • Visual programming language is still coding

The Power of the Reveal

Speaker: Hannah Fairfield

Notes

  • Using storytelling to reveal information
    • Start with building blocks -- and build where you want to go
    • Test your crazy ideas
  • Choosing/finding the right form to match your data is both difficult and rewarding
    • Layering information sets the path for discovery (especially given design for mobile)
    • Sometimes using hand sketching can be helpful if there is familiarity with the data; otherwise outside analysis can be needed
      • Too early in the process -- doing what you think it is going to look like is very dangerous
      • If you start and end with a scatterplot, you maybe miss something
  • Beware the false reveal
  • Editing to reveal on mobile is harder than it looks
    • Takes a lot longer; requires a lot more labor
  • Doing it live - elections & olympics
    • Have enough people on the team with specialties, so you can portion tasks out appropriately
  • Goofy is good (spot the ball NYTVis for example)
    • Goofy + thinky is even better (who needs a gps)

Interactive DataVis with React: Taming the Complexity of the Changing State

Speaker: Ilya Boyandin

Notes

  • Part of a long process to find a sustainable process for medium/large interactive visualization projects
  • Choosing a scalable architecture -- something enabled by react
  • We want to tell people what we know, and the media that we choose is very important
  • Take advantage of interactivity -- take advantage of faceted exploration, personal stories, and engagement with end users
    • Not enough pixels to show all data at the same time, so you have to simplify things; how do you reduce this in a meaningful way
  • Why is interactivity so hard? Because we add many states whenever we add interactive features -- the more states you have, the more transitions you have. Every state and every transition needs to be modeled.
  • Grammar of graphics --> statistical graphics can be expressed declaratively from some data model to a visual object (see: Vega, ggplot, bokeh, etc.)
    • If you have a complex application with many interconnected components, can you apply this same model? Not directly
  • What does D3 have to offer for re-usable applications?
    • D3 reusable charts --> allows packaging of an individual component
    • It doesn't tell us how to compose components, however
    • When you have many interdependent components, it is very difficult to manage this consistently
    • MVC is the industry standard for handling this. Data in the model, view on the screen, and controller to handle the data flow
      • For data vis on the web, D3 can live in the view
      • BUT, this doesn't solve the problem because you have many complex dependencies of views to models and controllers. You also have specific view models, which scatters state across models, so there is no single source of truth
  • What is a better way?
    • Get the app state right
    • Define how any state is rendered --> map application state to the representation. This is a much simpler model. Data is mapped directly to DOM objects, a la grammar of graphics
    • Why haven't we been doing this all along?
      • Costly full render on state change
      • React can provide a solution to this via its virtual DOM. Virtual DOM is easier (less costly) to manipulate, because you avoid browser reflow
    • React: 1. keeps track of the previous virtual DOM 2. a state change produces a new virtual DOM 3. React runs a heuristic to identify the minimal set of changes needed to mutate the actual DOM 4. the DOM is updated to reflect the most recent state
      • This makes full render affordable and a possible solution to this problem
    • React component render functions return a virtual DOM. The render method could, for example, create an SVG document. This allows mapping state directly to SVG
      • You can actually use many of the d3 functions inside of the render function; use them
  • Animation
    • This is where react does not excel very much.
    • Doing something like particle modeling where you care a lot about the state during the animation is a bad use case
    • However, if you catch the animation start and stop points, you can still control the whole application
    • Note -- Facebook is working on tween states with React so this should hopefully improve in the future.
  • Interactivity
    • You keep track of properties via getters and setters. On setter call, you render the application components
    • Compose a large application out of components which are functions
  • For more serious apps, there is a pattern proposed called "Flux" -- actions hit a dispatcher, which updates data stores, which then flow down to the views. This idea is similar to the data flow; everything goes in the same direction. (Flux notes)
  • Structuring your application with immutable data types allows for quick comparison of complex state (if the reference is the same, nothing changed)
  • Server-side rendering
    • Most visualizations use javascript to render into the DOM. These produce something invisible to search engines
    • However, because react uses virtual DOM, you can use the same code you use on the client to render on the server into a string.
  • React canvas, React native
    • Other implementations of react that use other targets
      • Levers for rendering into canvas
      • Native allows development of iOS/Java apps (no graphics, though)
      • Also some experiments for WebGL (not particularly promising)
    • You could theoretically use the same SVG app to render to SVG/native/other primitives
  • React developer tools
    • Extension for Chrome which shows your virtual DOM
  • Hot code reloading -- keep same state but re-render. Good for react
  • Article "the future of javascript mvcs" -- compare react, backbone, etc.
  • React enables an architecture that is:
    • Easy to reason about
    • Scales to large applications
    • Performs well when you are careful
    • Fun to use
  • How do you test?
    • The components are functions, so you should be able to assert certain inputs create outputs.
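
The "state in, rendering out" model and the one-way Flux-style flow can be sketched without React at all. The Python below is a toy illustration under that assumption: `render`, `update`, and `Store` are hypothetical names, and the "virtual DOM" is just a list of dicts.

```python
def render(state):
    """Pure function: the same state always yields the same view description."""
    return [
        {"tag": "rect", "x": i * 12, "height": v, "selected": i == state["selected"]}
        for i, v in enumerate(state["values"])
    ]


def update(state, action):
    """Return a NEW state for an action; the old state is never mutated."""
    if action["type"] == "select":
        return {**state, "selected": action["index"]}
    if action["type"] == "add":
        return {**state, "values": state["values"] + [action["value"]]}
    return state  # unknown action: same object back


class Store:
    """Single source of truth; actions flow in, rendered views flow out."""

    def __init__(self, state, on_change):
        self.state = state
        self.on_change = on_change

    def dispatch(self, action):
        new_state = update(self.state, action)
        if new_state is not self.state:  # cheap identity check thanks to immutability
            self.state = new_state
            self.on_change(render(new_state))


frames = []
store = Store({"values": [3, 5], "selected": None}, frames.append)
store.dispatch({"type": "select", "index": 1})
store.dispatch({"type": "add", "value": 8})
store.dispatch({"type": "noop"})
print(len(frames), frames[-1])
```

Since `render` is a plain function of state, testing it is just asserting on its return value for a given input, which is the point made in the testing note above.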

Data Visualization on a Deadline

Speaker: Alyson Hurt

Notes

  • NPR Tools -- what we're starting with
    • Everyone runs OSX, hosting on S3
    • Flask app locally to render content, flat files are then baked to S3. Less long-term maintenance, stands up better to heavy traffic
    • Some python development, but most things are done with js and D3
    • All development environments are the same
    • Version control
    • Styleguide -- have agreed upon best practices/code styles. Color & type standards should be included and baked in
    • Starter code -- have a consistent starter point
      • Two main things: large standalone projects, internal CMS
      • internal CMS has everything in the same repo, same ecosystem
  • Common design patterns
    • Lock the aspect ratio so everything stays the same
    • Tables collapse into rows
  • Google spreadsheets as a mini CMS
    • Key/value pairs
    • Try to collect everything from the project in the google spreadsheet
  • Open source -- separate code vs. content. Keeping everything in spreadsheets keeps this separation
  • Deployment
    • fabric, generates static files

User-Centered Visualization Research

Speaker: Lane Harrison

Notes

  • Exploratory vs expository data visualization -- don't know what you are looking for vs. knowing what you are looking for
  • Both demand "accurate" and "precise" visualization
  • How do you make an optimal design decision?
  • Leveraging perception to improve visualization design
    • Many techniques that work for the same data
    • Model based evaluation
      • Instead of comparative evaluation, you could model performance of a certain visualization on a given task
      • You can make strong claims about correlation
    • Test charts that are "one click away" in Excel
    • Results
      • 200,000 perceptual judgements
      • The perception of correlation can be measured using Weber's law
  • You can do a lot of good with good models
    • Depending on the correlation of your variables, you can choose different techniques
  • Explore properties of the conceptual space of correlation
    • Symmetries and asymmetries have effects on how your charts are read
  • Theory grounded model can build the science of visualization while providing real information of visual design
  • People do a lot more than just perceive, though, so you have more to worry about
    • How can you quantify impact of individual differences?
    • Does emotion impact graphical perception?
      • The more difficult the task, there is more clear separation over time -- people who are primed negatively perform much worse on a task that is difficult
      • Discussion with a cognitive psychologist -- positive moods can expand the perceptual spotlight of attention, and allow processing of a larger spatial area
      • Negative moods can constrict spatial areas of analysis
  • Example -- lack of understanding of conditional probability is a cause for overtreatment. But how do you visualize this information? You have the human on one side and the system on the other. This is problematic; how do we square this circle?
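
For reference, Weber's law in its general textbook form (the symbols are the standard ones, not taken from the talk): the just-noticeable difference $\Delta I$ grows in proportion to the baseline stimulus intensity $I$, with a constant $k$ that depends on the kind of stimulus.

```latex
\frac{\Delta I}{I} = k
```

The claim in the results above is that judgements of correlation follow a law of this kind, which is what lets different chart types be modeled and compared rather than only tested head to head.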

Weighing performance against pain: SVG, Canvas and WebGL

Speaker: Dominikus Baur (Twitter, Slides, Video)

Notes

  • The more you want to squeeze out of the browser, the more painful
  • You don't need to worry about performance too much when you are worried about bar charts
  • Better life index: one flower per country; leaves per flower, and each leaf depicts different metrics
    • Current implementation is one canvas element per flower
    • This makes the code very tricky
    • Moving to WebGL actually made things easier than in canvas
      • Fewer elements, fewer conditionals
    • On broadway -- create a sense of the city by using a bunch of different layers
  • Performance basics
    • Know when to stop
      • Performance optimization is a fickle mistress -- sometimes there is too much data and there is no improvement
      • Know your tools
        • Chrome/safari/firebug tooling -- zoom into one line of JS that screws with your performance
      • Remove things
        • Leave out all the unnecessary stuff (console.log, elements you don't actually need, css patterns)
          • Gradients have high performance costs, for example
        • Throttle re-rendering -- will only call the function once in a certain time window
      • Shift things
        • If you can't get rid of stuff, try to shift them elsewhere (webworkers)
      • Once you are deep into performance optimization, you have to replace the browser's basics, but are you really sure you want to do that? Browsers are pretty smart technologies.
        • As you move towards the GPU, you have to handle a lot of current browser functions
  • Why is stuff that is close to the browser slower than other things? Probably because of the DOM
    • Reflow: Layout of the whole page (where elements are, sizes, etc.)
    • Repaint: Website is redrawn
    • On initialization, you get a reflow and a repaint
      • If you change width/height you trigger reflows
      • Asking for width and height also triggers reflows (two of them)!
      • Better solutions:
        • Detach certain parts of the DOM so that you can remove whole node structures, and then re-attach them. This saves reflows (one for detach, one for append) vs on the fly calculations
  • Dynamic Graphics
    • SVG -- slowest. Mostly in browserland
      • Vector-based (as opposed to canvas/WebGL pixels), which means that it supports all screen sizes, retina displays out of the box
      • Supports CSS, including deep support on mobile
      • DOM integration. Useful because it works like a scenegraph -- hierarchical tree of elements and once we manipulate a parent element, children elements also get dragged along
      • Static SVG: 10,000 elements. Animated SVG: < 1,000 elements (maybe half of this on mobile)
    • Canvas
      • "Immediate mode" -- process input, update game state, render, wait for events
      • requestAnimationFrame() --> asks the browser to call a callback function before the next repaint, when it has the time to do so. Write a visloop that checks input, adjusts vis, renders, and then calls the visloop again when possible
      • You have to do everything yourself, though. To the browser, a canvas is just one element full of pixels. You can either project mouse coordinates or map elements to colors, and then do colorpicking
      • Big plus is performance over SVG
    • WebGL
      • Going to the very basics of computer graphics -- the graphics card draws huge numbers of triangles in parallel, and that's it. You can put textures on these triangles and the browser will display this for you. Changing the texture gives you animations, etc.
      • Lots of pain around GLSL, but three.js and pixi.js make it a bit easier to work with. three.js is like the D3 of WebGL
      • "You can also have that in your browser if you are brave"
      • You have to know what you are doing, and if it is worth it.
      • pixi.js (2D WebGL, much easier to work with than three.js)
    • Hybrid approaches
      • You can also combine approaches (see Times yield curve, for example)
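
The "throttle re-rendering" tip can be sketched as a small wrapper that lets a function run at most once per time window and drops the calls in between. This is a Python illustration of the pattern; the name `throttle`, the return value, and the injectable `clock` are choices made for the example, not any particular library's API.

```python
import time


def throttle(fn, interval, clock=time.monotonic):
    """Wrap fn so it runs at most once per `interval`; extra calls are dropped.

    The wrapper returns True when fn actually ran and False when the call
    was skipped. `interval` is in the same units as `clock()`.
    """
    last = None

    def wrapped(*args, **kwargs):
        nonlocal last
        now = clock()
        if last is None or now - last >= interval:
            last = now
            fn(*args, **kwargs)
            return True
        return False

    return wrapped


renders = []
redraw = throttle(renders.append, 0.1)
for frame in range(5):
    redraw(frame)  # called in a tight loop; the repeats inside the window are dropped

print(renders)
```

This version drops skipped calls entirely; a common variant also schedules one trailing call so the final state always gets rendered.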

Why Exploring Big Data Is Hard (and What We Can Do About It)

Speaker: Danyel Fisher

Notes

  • How does working with larger data sets differ from exploratory analysis?
    • Cases where the size of the dataset is part of the problem to be solved
    • Why is "big data" challenging -- the representation problem and the interaction problem
    • There are also monetary costs associated with this processing
    • 10^6 pixels on screen space, 10^9 bytes in memory, 10^12 bytes on the hard disc, so there is some conversation that needs to happen around aggregation
  • Aggregation
    • What is the aggregate form of a scatterplot? Bar chart? Etc.
      • Some things aggregate well, but choosing the aggregation is a useful way to think about what the visualization is trying to show you
    • "Generalized histogram"
      • Select buckets on data, examine points, then create shapes based on buckets
      • "Walt the hypothetical histogram"
        • Decide which operations can be supported rapidly, and which operations are going to be slow
  • Sampling -- many people make many mistakes
    • People cut off queries early and are fairly happy with approximate results
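
As a rough illustration of the "generalized histogram" idea (select buckets, assign points, draw shapes per bucket), here is the aggregate form of a scatterplot: a 2D binning. Python sketch; `bin_2d` and the sample points are invented for the example.

```python
from collections import Counter


def bin_2d(points, x_size, y_size):
    """Aggregate (x, y) points into rectangular buckets and count each bucket."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // x_size), int(y // y_size))] += 1
    return counts


points = [(0.5, 0.5), (0.9, 0.1), (1.2, 0.3), (-0.1, 0.0)]
counts = bin_2d(points, 1, 1)
print(counts)
```

However many points come in, the output is bounded by the number of buckets, which is what makes it drawable in roughly 10^6 pixels; each bucket count can then be mapped to a color or size.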

What is the role of visualization in prediction?

Speaker: Adam Perer

Notes

  • Cohort construction
    • Population/group of individuals
    • Build groups
  • Feature construction
    • In construction, you want to contain as many features about your individual inputs as possible
  • Cross validation
    • Fold data many ways; many cross-validation sets
  • Feature selection
  • Classification
  • What about visualization
    • Ecosystem of tools that touch upon the parts of the predictive modeling pipeline; outputs of the pipeline
  • Coquito (cohort queries)
    • Visual query builder where you can add constraints to eventually narrow down to the population that you would want
    • Create a sequence of filters
  • Infuse (interactive feature selection)
    • What feature selection algorithms do you use? There are some answers, but often there aren't good answers to understand why they are using different algorithms (same is true for classification algorithms)
    • How do you inject domain expertise?
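
For the cross-validation step of the pipeline, a minimal sketch of splitting a dataset into folds. This is generic k-fold in plain Python, written for these notes rather than taken from the talk's tools; it does not shuffle, which a real pipeline normally would.

```python
def k_fold_indices(n_items, k):
    """Yield (train, test) index lists; each item lands in exactly one test fold."""
    if not 1 < k <= n_items:
        raise ValueError("need 1 < k <= n_items")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("test:", test)
```

Each model is trained on `train` and scored on `test`; the spread of the k scores is itself something worth visualizing.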

Motion Design with CSS

Speaker: Val Head

Notes

  • To do animations, you give it a list of keyframes
    • The biggest limit is your willingness to type them
  • Advantages to using CSS:
    • Low barrier to entry
    • Reusability
    • Performance (the advantages are worthwhile if you do things smartly and not modify things that are naturally expensive, for example)
  • CSS is good at:
    • Predetermined motion
    • Animating between two states (hover/unhover, keyframes)
    • Reusable motion
  • Less good at
    • Dynamic motion
    • Multi-style animation
    • Physics -- CSS is a declarative style language, not a programming language
  • Animation can be a strong tool, especially if it has purpose and style
  • Animation principles
    • 12 principles
      • Timing
        • If there is no other principle of animation, this is the one to focus on
        • "Appearing to obey the laws of physics"
        • Establishes mood, emotion, and reaction
      • Follow-through
        • Cubic Beziers are a good easing function
      • Secondary action -- supplemental action to the main action that reinforces and adds dimension
  • Even better browser tools for CSS animations
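
To show what a cubic Bezier easing function actually computes, here is a sketch that evaluates a CSS-style `cubic-bezier(x1, y1, x2, y2)` curve: the curve runs from (0, 0) to (1, 1), the input is elapsed time as a fraction, and the output is animation progress. Written in Python for illustration; browsers use their own solvers, and the bisection here is just one simple way to do it.

```python
def cubic_bezier(x1, y1, x2, y2):
    """Return an easing function for control points (x1, y1) and (x2, y2)."""

    def coord(t, p1, p2):
        # One coordinate of a cubic Bezier whose endpoints are fixed at 0 and 1.
        return 3 * (1 - t) ** 2 * t * p1 + 3 * (1 - t) * t ** 2 * p2 + t ** 3

    def ease(x):
        # Find the curve parameter t whose x-coordinate equals the time fraction.
        # Bisection works because x(t) is monotonic when 0 <= x1, x2 <= 1.
        lo, hi = 0.0, 1.0
        for _ in range(50):
            mid = (lo + hi) / 2
            if coord(mid, x1, x2) < x:
                lo = mid
            else:
                hi = mid
        return coord((lo + hi) / 2, y1, y2)

    return ease


ease_in_out = cubic_bezier(0.42, 0.0, 0.58, 1.0)  # the CSS `ease-in-out` keyword
linear = cubic_bezier(0.0, 0.0, 1.0, 1.0)

for step in range(5):
    x = step / 4
    print(x, round(ease_in_out(x), 3))
```

The slow start and slow finish show up as progress lagging behind time early on and catching up late, which is the kind of timing that helps motion look physical.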

Building Earth: an animated map of global weather

Speaker: Cameron Beccario

Notes

  • Earth -- a global animation of weather conditions that runs in the browser

  • Big thanks to jon for the notes template
