Scala at Twitter at E2E SF Tech Talks
Ever since the SDForum panel on emerging languages I’ve been intrigued by the Scala programming language, but I have not been able to invest time in a project to try it out. I got as far as setting up Eclipse with Scala and Lift but no further. When I heard about the Engineer-to-Engineer Tech Talk about the use of Scala at Twitter, I made every effort to attend. The presentation was given at Redfin on April 30, 2010 by Alex Payne, who was at Twitter at the time and was one of the primary advocates for introducing Scala there. The talk focused mostly on the good points of Scala; some of the reasons not to use it were possibly underplayed.
The Redfin blog also has a post about the E2E talk about Scala at Twitter.
The following are my raw notes from the event.
At Twitter since 2007
Built out the platform for developers
Works on infrastructure
Spearheaded the introduction of Scala
Co-authored Programming Scala
Why Scala? You already have Ruby, right?
Yes, works great for many things
Particular sweet spot, front end Rails
Pain point, long running, computation heavy processes
How to introduce another language that we enjoyed working in as much?
Evolution of Ruby is pretty slow; JRuby has advantages
Problems with garbage collection
It’s hard to keep Ruby up and running for long periods of time
Since Ruby works on so many different architectures it doesn’t work REALLY well on any one.
Have a daemon that monitors and restarts Ruby processes
Java – productive but not sexy like Ruby
OCaml – very few production uses
Erlang – lots of stuff is being built in Erlang, very specific types of infrastructure and IO-based stuff, but the knowledge base around the language was not great and the user base was inaccessible
Not impossible for mere mortals to learn.
Accessible developers – burned by Ruby development not being accessible; Ruby wanted to focus on fun and not enterprise. Understand concerns but won’t address them.
How Scala fits
It’s fast – at least as fast as Java, usually; do more with fewer machines
Can borrow libraries from java with no performance hit
Runs on the stable, tunable JVM
Great community, small but growing and the developers of Scala are accessible
You can start with Java-like code and move on to more Haskell-like functional programming. The Lift framework is a good codebase for functional programming with Scala.
OOP and FP, together (has a thesis behind it, a specific intellectual problem)
Rich static type system with inference, does more than the Java type system, can bound the types
Flexible DSL-friendly syntax
Traits (kin to modules in Ruby), i.e. aspect oriented programming or cross cutting concerns; can throw in as many traits as you like into a class or object
Choice between mutability and immutability; in Erlang everything has to be immutable, while Scala gives you a choice, which is good and bad. 9 times out of 10 you want immutability, but it is nice to have mutability when you need it
Optional laziness: won’t assign a value until the last moment, when you need it
Pattern matching: case statements with pattern matching, can match on the type or on items in an array (wow, this sounds really neat)
XML literals, can throw in XML like a string with XPath like capabilities; don’t need a SAX lib
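A few of the features listed above are easier to picture in code. The following is a minimal sketch written after the fact, not code shown in the talk; every name in it (FeatureTour, User, describe, and so on) is made up for illustration.

```scala
// Hypothetical names throughout; a sketch of traits, val/var,
// lazy evaluation and pattern matching.
object FeatureTour {
  // Traits: mix-in units of behavior, roughly like Ruby modules.
  trait Named { def name: String }
  trait Greeter extends Named { def greet: String = "Hello, " + name }
  trait Shouter extends Named { def shout: String = name.toUpperCase + "!" }

  // A class can mix in as many traits as it needs.
  class User(val name: String) extends Greeter with Shouter

  // Immutable by choice (val), mutable by choice (var).
  val limit: Int = 140
  var evaluations: Int = 0

  // Optional laziness: the right-hand side runs once, on first access.
  lazy val expensive: Int = { evaluations += 1; 21 * 2 }

  // Pattern matching on the runtime type, with an optional guard.
  def describe(x: Any): String = x match {
    case i: Int if i > limit => "int over the limit"
    case i: Int              => "int " + i
    case s: String           => "string of length " + s.length
    case _                   => "something else"
  }

  // Pattern matching on the items of a list.
  def firstItems(xs: List[Int]): String = xs match {
    case Nil                  => "empty"
    case only :: Nil          => "just " + only
    case first :: second :: _ => "starts with " + first + " and " + second
  }

  def main(args: Array[String]): Unit = {
    val u = new User("alex")
    println(u.greet)                   // Hello, alex
    println(u.shout)                   // ALEX!
    println(describe("tweet"))         // string of length 5
    println(firstItems(List(1, 2, 3))) // starts with 1 and 2
  }
}
```

The list example uses a List rather than an array only because it keeps the sketch portable across Scala versions.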
The Concurrency Story
Actors are great but they are just one solution: model concurrency around messages sent to actors, a lot like a queuing system; works great except when it doesn’t and you need threads. Actors are a library implemented in the language
Threads are available
So is asynchronous, event-driven networking
So is Netty and Apache Mina
People even roll their own Actor and STM libraries, feels like a natural extension of the language
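To make the "messages sent to a mailbox" idea concrete, here is a hand-rolled sketch that uses only a plain thread and a java.util.concurrent queue. It is not the Scala actors library and not anything Twitter showed; the Accumulator class and its Add/Stop messages are hypothetical.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingQueue}

object MailboxSketch {
  // The messages the worker understands (a made-up protocol).
  sealed trait Message
  final case class Add(n: Int) extends Message
  case object Stop extends Message

  // An actor-like worker: one thread draining one mailbox.
  // Only the worker thread touches `total`, so no lock is needed;
  // the latch makes the final value visible to the caller.
  final class Accumulator {
    private val mailbox = new LinkedBlockingQueue[Message]()
    private val done = new CountDownLatch(1)
    private var total = 0

    private val worker = new Thread(new Runnable {
      def run(): Unit = {
        var running = true
        while (running) {
          mailbox.take() match {
            case Add(n) => total += n
            case Stop   => running = false
          }
        }
        done.countDown()
      }
    })
    worker.setDaemon(true)
    worker.start()

    // Senders never block on the work itself, only on enqueueing.
    def send(m: Message): Unit = mailbox.put(m)

    // Blocks until Stop has been processed, then returns the sum.
    def awaitResult(): Int = { done.await(); total }
  }

  def main(args: Array[String]): Unit = {
    val acc = new Accumulator
    (1 to 10).foreach(i => acc.send(Add(i)))
    acc.send(Stop)
    println(acc.awaitResult()) // 55
  }
}
```

Because messages are processed one at a time in arrival order, the worker behaves much like a consumer on a queue, which is the comparison made in the talk.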
Java Interop is a Big Win
More so than we thought, even
Relatively easy to use Scala with tools like Hadoop
Trivial to make use of Thrift-generated Java
Thousands of battle-tested libraries
Lowers learning curve
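As a rough illustration of what using a Java library from Scala looks like, the sketch below calls java.util classes directly. The object and method names are invented for this example. The Java-to-Scala conversion is done by hand because the standard library's converter package has moved between Scala versions.

```scala
import java.util.{ArrayList, Collections}

object InteropSketch {
  // Java classes are instantiated and called like Scala ones.
  def sortedNames(names: String*): ArrayList[String] = {
    val list = new ArrayList[String]()
    names.foreach(n => list.add(n))
    Collections.sort(list) // a static Java method, called as-is
    list
  }

  // Manual conversion from a Java list to an immutable Scala List.
  def toScalaList(list: java.util.List[String]): List[String] = {
    val builder = List.newBuilder[String]
    val it = list.iterator()
    while (it.hasNext) builder += it.next()
    builder.result()
  }

  def main(args: Array[String]): Unit = {
    val sorted = sortedNames("kestrel", "flock", "hosebird")
    println(sorted.get(0))       // flock
    println(toScalaList(sorted)) // List(flock, hosebird, kestrel)
  }
}
```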
How is Scala used?
SOA is a new idea to the Web 2.0 world
Isolated, independently developed and tested
Our Scala powered services
Kestrel for queuing
Flock is a set database for social graph storage, not a graph database; 20k operations per second
Hawkwind is people search
Hosebird powers the streaming API, Twitter sells streams to Google, Yahoo, Bing; low latency, was not able to find equivalent that performed as well
More being built all the time…
Been good, but looking at something new called Avro from Hadoop
You define your data structures and methods
It generates code in a variety of languages
Takes care of the tricky networking bits (mostly)
Provides backward compatibility as your system evolves.
Has worked great, but we are looking at Avro too.
Scala SOA pros and cons
Pro, easy to spin up small teams to tackle big problems. Two guys can build a name search in a couple of months
Con, separate code bases mean potential overlaps
Pro, Thrift plus best practices means fast, reliable services
Con, every project requires slightly different operation approaches
Understand the tradeoffs.
Tools: making it easier to work in Scala
IDE’s? Who Needs ‘Em?
We’ve tried them all. On any given day, one is better than the other
Most of us use a traditional text editor
IntelliJ IDEA isn’t too bad, if you can’t give up your IDE
Scala removes the need for an IDE
Sbt modes/plugins make for a fast, sane workflow
Alex is an IDE
sbt – simple build tool
Scala’s answer to Ant and Maven; it can actually bring in your Ant and Maven build scripts
Sets up new projects
Manages dependencies and build tasks
Interactive console mode (using the Scala REPL)
Can automatically compile and run any task; run your tests as soon as you save your code!
(Seems to really like this tool)
Uses fsc, the fast Scala compiler
BDD testing, the Scala way
Readable, flexible syntax
Tons of matchers for all types of objects
Supports several different mocking libraries
Everything you expect from a testing tool and more
Great maintainers, very responsive
Ostrich (fork me on github)
Do tons of stats gathering and didn’t do this before. Stats gathering helped stability at Twitter. setting metrics and trying to improve
In-process statistics gathering
Provides counters, gauges, and timings
Share gathered stats via JMX, JSON over HTTP, plain text Telnet-style, Scribe, etc
Simple to use and easy to integrate
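For a sense of what in-process counters, gauges, and timings amount to, here is a small sketch. It does not use Ostrich and does not reproduce its API; StatsSketch and all of its methods are hypothetical names chosen for this example, and exporting the values (JMX, JSON over HTTP, and so on) is left out.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicLong

object StatsSketch {
  private val counters = new ConcurrentHashMap[String, AtomicLong]()
  private val gauges   = new ConcurrentHashMap[String, () => Double]()
  private val timings  = new ConcurrentHashMap[String, AtomicLong]() // total nanoseconds

  // Counter: a number that only goes up, e.g. requests served.
  def incr(name: String, delta: Long = 1L): Long = {
    counters.putIfAbsent(name, new AtomicLong(0L))
    counters.get(name).addAndGet(delta)
  }

  def counter(name: String): Long = {
    val c = counters.get(name)
    if (c == null) 0L else c.get()
  }

  // Gauge: a value sampled on demand, e.g. current queue depth.
  def gauge(name: String)(read: => Double): Unit = {
    gauges.put(name, () => read)
  }

  def readGauge(name: String): Option[Double] =
    Option(gauges.get(name)).map(f => f())

  // Timing: wrap a block and accumulate how long it took.
  def time[A](name: String)(body: => A): A = {
    val start = System.nanoTime()
    try body
    finally {
      timings.putIfAbsent(name, new AtomicLong(0L))
      timings.get(name).addAndGet(System.nanoTime() - start)
    }
  }

  def timingNanos(name: String): Long = {
    val t = timings.get(name)
    if (t == null) 0L else t.get()
  }

  def main(args: Array[String]): Unit = {
    incr("demo_requests")
    incr("demo_requests", 4)
    gauge("demo_queue_depth")(3.0)
    val answer = time("demo_work") { (1 to 1000).sum }
    println(counter("demo_requests"))      // 5
    println(readGauge("demo_queue_depth")) // Some(3.0)
    println(answer)                        // 500500
  }
}
```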
Configuration files and logging
A flexible file format that handles include, inheritance, variable substitution
Tunable logging with Scribe support
Subscription API to push and validate changes
Even an Emacs mode
Scala has a cool language feature that makes it easy to maintain
A set of extensions to specs
Test concurrent operations
Freeze and unfreeze time
Manage temporary resources
Now integrated into specs itself
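The "freeze and unfreeze time" idea can be sketched without any test framework: code under test reads the time through a clock that a test can pin to a fixed value. This is only an illustration of the technique, not the API of the specs extensions mentioned above; ClockSketch and Session are hypothetical names.

```scala
object ClockSketch {
  // None means "use the real clock"; Some(t) means time is frozen at t.
  @volatile private var frozenAt: Option[Long] = None

  def now(): Long = frozenAt.getOrElse(System.currentTimeMillis())

  def freeze(atMillis: Long): Unit = { frozenAt = Some(atMillis) }

  // Moves frozen time forward; has no effect when time is not frozen.
  def advance(millis: Long): Unit = { frozenAt = frozenAt.map(_ + millis) }

  def unfreeze(): Unit = { frozenAt = None }

  // Example code under test: it asks ClockSketch for the time
  // instead of calling System.currentTimeMillis() directly.
  final class Session(createdAt: Long, ttlMillis: Long) {
    def expired: Boolean = ClockSketch.now() - createdAt >= ttlMillis
  }

  def main(args: Array[String]): Unit = {
    freeze(1000L)
    val session = new Session(now(), 500L)
    println(session.expired) // false
    advance(500L)
    println(session.expired) // true
    unfreeze()
  }
}
```

With time frozen, an expiry test runs instantly and deterministically instead of sleeping for the length of the timeout.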
A better JSON codec
Uses parser combinators, another really cool esoteric feature of Scala
Lots of extra test cases
Battle tested in production use
Scala has triple-quote notation like Python
Other Twitter Scala libraries
Naggati, a protocol builder for Apache Mina
Smile is an actor powered memcached client
Querulous is a nice SQL database client
Jackhammer, a load testing framework, in early stages
Probably more, check out github, al3x
Style guides on our internal wiki, good and bad, it will become like C++ in 10 years, need a style guideline. (Can they release this? Or some version of it?)
Code reviews. Doesn’t go into master branch unless it is reviewed.
Good developer can learn any language
Stairway book is good for reference and example.
Google the “seductions of Scala”
Where is Scala at? (This section is old info).
Stable 2.7 series is what Twitter uses in production. Has some issues with collections
Upcoming 2.8 release tons of great changes, should probably be 3.0
Scala Summit coming up during OSCON in July in Portland
Growing community, more companies, more books, more bloggers, more training
Solid organizational direction
Daily scala, planet scala
@al3x for the style guide
Did not identify easy to learn as a Scala quality.
Interested in the long term, so another language might be more productive, but Scala helps your education and growth. Lots of room for a developer to grow with this language. Scala lets you evolve in that way.
GC is a frustration but no more so than in Java. Some applications require JVM tuning, and Scala does change the tuning
Are there ways for Java to call back into Scala?
You start with Java-like scala but as you get into more Scala-esque code, then it is more difficult. In Clojure, the philosophy is that you call Java but Java does not call into you.
Is there a web framework for Scala?
Lift is one of the longest living projects in Scala and has helped push Scala forward. Lift is established and robust. Twitter doesn’t use it because it is a little heavyweight. David Pollak approached testing on the assumption that the compiler will do most of the testing. There is a GSC to backfill the test coverage.
Twitter uses Jetty and has some stuff that wraps it up, just 20 lines of code.
How do you do debugging?
Same Java tools. Stack traces are a little funny looking at first.
How does Scala call into Java, does it look ugly?
Thinks that it makes Java code easier to deal with. Can leave out dots and parens. Pretty easy to work with everything in Java except collections, but that is fixed in 2.8. Can get hairy when going back and forth. Except for collections, there is not a lot of conversion back and forth between Java and Scala objects.
How do you decide what is done in Scala or Java?
Scala is for services and Ruby is for the front end, so sometimes they have to implement a library in each. Trying to get everything written in Scala over time.