Wednesday, April 11, 2012

Announcing go-msgpack


Announcing go-msgpack, a rich msgpack codec for Go. It supports encoding to and decoding from the msgpack binary format, and can be used as the codec for net/rpc communication.

https://github.com/ugorji/go-msgpack
http://gopkgdoc.appspot.com/pkg/github.com/ugorji/go-msgpack

It provides features similar to the encoding packages in the standard library (e.g. json, xml, gob).

Supports:
  • Standard Marshal/Unmarshal interface.
  • Standard field renaming via tags
  • Encoding from any value (struct, slice, map, primitives, pointers, interface{}, etc)
  • Decoding into a pointer to any non-nil value (struct, slice, map, int, float32, bool, string, etc)
  • Decoding into a nil interface{} (big)
  • Handles time.Time transparently.
  • Provides a Server and Client Codec so msgpack can be used as communication protocol for net/rpc.
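
To make the list above concrete, here is a minimal sketch of the same conveniences (tag-based field renaming, decoding into a typed value, decoding into a nil interface{}) using encoding/json from the standard library, which go-msgpack is modelled on. It deliberately does not call go-msgpack itself, and the Person type and its tags are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Person is a hypothetical type; the tags rename the fields on the wire.
type Person struct {
	Name string `json:"n"`
	Age  int    `json:"a"`
}

// roundTrip encodes p, then decodes the bytes twice: once into a typed
// struct and once into a nil interface{}.
func roundTrip(p Person) (Person, interface{}, error) {
	data, err := json.Marshal(p)
	if err != nil {
		return Person{}, nil, err
	}
	var typed Person
	if err := json.Unmarshal(data, &typed); err != nil {
		return Person{}, nil, err
	}
	var untyped interface{}
	if err := json.Unmarshal(data, &untyped); err != nil {
		return Person{}, nil, err
	}
	return typed, untyped, nil
}

func main() {
	typed, untyped, err := roundTrip(Person{Name: "Ada", Age: 36})
	fmt.Println(typed, untyped, err)
}
```

The untyped result comes back as a generic map keyed by the renamed fields, while the typed result fills the struct directly.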


Usage:
  // v can be interface{}, int32, bool, map[string]bool, etc
  dec = msgpack.NewDecoder(r, nil)
  err = dec.Decode(&v)

  enc = msgpack.NewEncoder(w, nil)
  err = enc.Encode(v)

  // The methods below are convenience wrappers over the functions above.
  data, err = msgpack.Marshal(v, nil)
  err = msgpack.Unmarshal(data, &v, nil)

  // RPC communication
  conn, err = net.Dial("tcp", "localhost:5555")
  rpcCodec := msgpack.NewRPCCodec(conn)
  client := rpc.NewClientWithCodec(rpcCodec)
  ...

Why?
I was initially looking at different binary serialization formats for an application I'm developing. I looked at Thrift, Protocol Buffers, BSON, Msgpack, etc.

I finally decided on msgpack:
  • compact
  • supports basic types (just like JSON: Numbers, Bool, Null, Bytes/String, Map, List) from which other data structures can be assembled
  • raw bytes support (which can represent binary data or strings)
  • no schema needed (just like JSON)
  • cross-language support
  • has pretty good mindshare
Unfortunately, the Go library on the msgpack.org site is old, does not build, and is not "smart" like the encoding/json package. Specifically, it doesn't allow me to decode into a typed object (e.g. struct, bool, etc).

I wrote go-msgpack to give the same conveniences we've gotten used to (spoiled by using the encoding/json package), while being really performant.

Summary of my testing:
  • Decodes significantly faster (120% faster) and encodes only slightly slower (20% slower) than encoding/json
  • Uses less disk space (40% less)
  • May not require compression (compression only gave a 10% reduction in size).
    Since compression/decompression time may be significant, this may be an important win.
Hope folks use it and enjoy using it. I know I will. Please feel free to send me feedback.

Friday, March 16, 2012

Streamlining Go App Engine Runtime

With App Engine, the Go Runtime is a mashup of Python Runtime, Go SDK and glue code (Go and Python). This poses some challenges during development, which this proposal addresses with solutions.

Current setup of Go Runtime

The current Go Runtime uses:
  • Python Runtime:
    1. RPC/Api server, for all RPC code which should not be handled directly by Go Code (e.g. datastore interaction, tasks, etc)
    2. Frontend Proxy: Requests come into Python Runtime, and are proxied to Go App if it is not a match for an RPC/Api request or a static file. 
    3. Tools: appcfg is used for all interaction with production environment
  • Go SDK:
    1. Go Runtime is a restricted runtime (limited access to some packages like syscall, unsafe, etc, and APIs like io.write, etc). A restricted Go SDK is bundled with the Go Runtime.
  • Glue Code:
    1. Go Runtime provides go-app-builder (command-line tool written in Go) which can build Go App on demand 
    2. Python Glue Code is run on each request, and checks if any file in the app tree has changed. If so, it rebuilds and/or restarts Go App.
Potential Issues with Current Setup

The potential issues with these are:
  1. Too much bundled into Go App
    1. I currently include a symbolic link to my GOPATH in my app
    2. go-app-builder looks for any file in there which has an init(...) method, and includes that in the synthetic main.
    3. Consequently, I have all my GOPATH code in my Go App.
    4. The other option is to either copy packages I use, or selectively add symbolic links to said packages in my app directory. Both of these have issues:
      1. For copying, I always have to ensure multiple directories are in sync, which defeats version-control and ease-of-development
      2. For symlinks, it gets hard when I only want some sub-packages and not others. For example, I have gae, gae/app, gae/db, gae/counter packages. In a simple app, I only need gae and gae/app. But I can't make gae and gae/app symbolic links (because gae has to be a directory for gae/app to be a symlink). So my only option is to include the whole gae directory with all its subpackages.
  2. Testing is a challenge:
    1. I have to use an installed Go SDK (not the bundled SDK in Go Runtime), since the bundled Go SDK is very restricted and incomplete.
    2. I have to make symbolic links to appengine and appengine_internal from the bundled SDK goroot directory in my GOPATH, since I have to use an installed Go SDK.
    3. There isn't a nice exported API in appengine_internal which allows us to interact with the RPC/API server via a simple appengine.NewContext(...), just as a Go App does.
  3. Documentation is a challenge
    1. godoc from bundled SDK in Go Runtime does not honor GOPATH appropriately, to show all my code
  4. Performance Testing
    1. This is hard because Python Runtime only allows one request at a time, giving a false picture of performance. E.g. a page request which should load App Code and 10 images in parallel and take 10ms will take about 100ms when run through Python Runtime.
    2. Concurrent Testing is hard, since requests via Python runtime only go one at a time.
  5. Promote Go development
    1. App Engine is the current poster child for Go Language. It would be nice if the usage experience clearly showed Go at its best:
      1. Encourage installing a Go SDK
      2. Encourage seamlessly using shared Go Code (not ones written specifically for the app)
      3. Encourage using a single godoc instance to see all code:
        In my case, I put all my go code (even app code) as packages in my GOPATH.
      4. Clearly showcase, even in development, Go's benefits (concurrency and performance)
      5. Show how easy it is to build generic Go Apps
  6. No Easy way to build for dev or production
    1. I have some handlers which I want to bind during development (for dev testing), but not during production
    2. I have some code which I want to include and which are used during development, but should not be in production
    3. the // +build mechanism can help here
Proposed Solution

The summary of the solution is to:
  1.  Developers interact with a gaetool executable (written in go) which solves all problems above elegantly.
    1. Hide the Python api_server.py behind the scenes.
    2. Proxies requests to Python api_server, or Go App, or serves static files appropriately (with fine concurrency)
    3. Handles rebuilding/restarting Go App appropriately, vetting app for "restrictions"
    4. Handles assembling full app for uploading to the cloud.
  2. Use an installed Go SDK (as opposed to bundling a restricted Go SDK)
  3. Do not bundle any compiled artifacts in AppEngine SDK (everything in source)
    1. I know this will be controversial, but please hear me out.
    2. This also means that there will be only one Go Runtime SDK (not one per supported platform/arch combination). 
  4. Use a single downloadable AppEngine SDK bundled as a zip, which includes:
    1. Generic Python Runtime (without Python demos, docs, etc)
    2. Source of Go SDK (gaetool, appengine packages, as gosrc/*/*.go files)
    3. Demos, documentation, other complementary files
  5. Export appengine_internal.initAPI(netw, addr string). This way, testing just involves:
    1. Start appengine dev server
    2. call appengine_internal.initAPI(netw, addr string) in init() method of _test.go file.
    3. Use appengine.NewContext(...) as usual in test code.
  6. Support a build context flag for dev-time building i.e. // +build appengine appenginedev
    1. Allows us to easily have separate top-level files for development and production respectively.
Tool written in Go (gaetool): developer interaction

The gaetool execution depends on the following:
  1. Build context tags:
    1. // +build appengine (controls inclusion within appengine build)
    2. // +build appenginedev (controls inclusion within appengine dev build)
  2. app.json (instead of app.yaml):
    1. Go has builtin support for json and templates, so it can easily read and comprehend an app.json, and spit out an app.yaml (for use by api_server.py and appcfg.py)
    2. This is necessary because gaetool now takes over serving static files and all front-end operations (e.g. ensuring secure-only urls), but api_server.py and appcfg.py need app.yaml
    3. Limits app configuration to what is supported in Go Runtime: 
      • app identity (name, version, api-runtime)
      • static files/directories (with url, mime type, cache expiration) and app resources
      • secure-only urls
      • enabled services (xmpp, email, etc)
      • admin console custom pages
      • custom out-of-app error responses (for over_quota, dos_api_denial, timeout)
  3. Go App Source in the Go App Directory/src.
    1. Go App Directory will be added to GOPATH during a gaetool build
  4. top level package main (in src/main/main.go)
    1. This allows the developer to determine the initialization sequence, and what should be included in his Go App (and not all the code which is in GOPATH)
    2. AppEngine synthetic main will just call appengine_internal.Main() or whatever is applicable. 
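
The app.json to app.yaml step described in item 2 above could be sketched roughly as follows, using only encoding/json and text/template. The AppConfig shape and the JSON key names are assumptions made for illustration (the proposal does not define a schema), and the generated YAML only approximates a Go-runtime app.yaml.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"text/template"
)

// AppConfig is a hypothetical shape for app.json; the proposal does not
// define the actual schema.
type AppConfig struct {
	Application string      `json:"application"`
	Version     string      `json:"version"`
	APIVersion  string      `json:"api_version"`
	StaticDirs  []StaticDir `json:"static_dirs"`
}

// StaticDir maps a URL prefix to a directory of static files.
type StaticDir struct {
	URL string `json:"url"`
	Dir string `json:"dir"`
}

// yamlTmpl renders static handlers first, then a catch-all for the Go app.
var yamlTmpl = template.Must(template.New("app.yaml").Parse(
	`application: {{.Application}}
version: {{.Version}}
runtime: go
api_version: {{.APIVersion}}
handlers:
{{range .StaticDirs}}- url: {{.URL}}
  static_dir: {{.Dir}}
{{end}}- url: /.*
  script: _go_app
`))

// toYAML reads app.json content and returns the app.yaml text.
func toYAML(appJSON []byte) (string, error) {
	var cfg AppConfig
	if err := json.Unmarshal(appJSON, &cfg); err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := yamlTmpl.Execute(&buf, cfg); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	in := []byte(`{"application":"demo","version":"1","api_version":"go1",
		"static_dirs":[{"url":"/static","dir":"static"}]}`)
	out, err := toYAML(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Keeping the template in one place means the tool, rather than the developer, owns the YAML that api_server.py and appcfg.py read.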
Wrappers for the gaetool exist as gaetool.sh and gaetool.bat files, and basically do the pseudocode:
if __directory_sdk_extracted_to__/gosrc/gaetool/gaetool[.exe] does not exist
-> cd __directory_sdk_extracted_to__/gosrc/gaetool
-> go build
->  cd -
__directory_sdk_extracted_to__/gosrc/gaetool/gaetool[.exe] [...]

To run dev server or update app, developer will run gaetool as below:

cd __my_app_dir__
__directory_sdk_extracted_to__/gaetool[.sh|.bat] [...]

To do other interaction with the production app engine system (like downloading logs, updating indexes, etc), developer will use appcfg.py as before:
cd __my_app_dir__
__directory_sdk_extracted_to__/appcfg.py [...]

To see full sources, and leverage app engine sources within test code, or godoc, developer will:
  1. Add __directory_sdk_extracted_to__/gosrc to GOPATH
That's the full interaction that the developer has with the Go Runtime SDK. All the magic is hidden behind the gaetool.

Tool written in Go (gaetool): Logic: How it works

The gaetool running as dev server does the following:
  1. Exec api_server.py, so that go app can interact with it
  2. Given the "main" package, it will do something like "go list" to find all the dependencies, and note the corresponding directories for polled tracking (in lieu of a cross-platform fsnotify functionality)
    1. This allows app sources to live outside the app directory, but within GOPATH
  3. It also walks the app directory, and keeps track of the full directory tree
  4. Every 2+ seconds, it looks to see if any directory has changed (modification time changed, or directory deleted).
    1. If any src directory, it will:
      1. reset list of src directories
      2. delete and rebuild packages for modified directories
      3. kill running Go App
      4. rebuild Go App
      5. Restart Go App
    2. If any other app directory, it will 
      1. reset list of app directories
      2. kill running Go App
      3. Restart Go App
  5. Setup http listener on the Go App
  6. Setup http listener on the gaetool, and http proxies to Go App and Api Server, and file server handler for the static files
    1. All http interaction goes through the http listen port on the gaetool, which proxies to Go App or Api Server, or passes request to its file server for static files
An app build now has some intelligence to respect the "restricted" nature of the production app engine runtime. At start of each build, app will inspect the result of "go list" and ensure that no "restricted" APIs (syscall, unsafe, io write, etc) are used. If this inspection passes, then the app is built. Else, the app build fails.
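
As a rough illustration of that vetting step, the sketch below parses only the import clause of a Go source file and reports restricted packages. Note the swap: the post describes inspecting the output of "go list", whereas this sketch uses go/parser directly so that it stays self-contained. The restricted set contains just the two packages the post names.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// restricted holds the import paths this sketch rejects.
var restricted = map[string]bool{"syscall": true, "unsafe": true}

// restrictedImports parses only the imports of src and returns the
// restricted paths it finds.
func restrictedImports(filename, src string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, imp := range f.Imports {
		path, err := strconv.Unquote(imp.Path.Value) // strip the quotes
		if err != nil {
			return nil, err
		}
		if restricted[path] {
			bad = append(bad, path)
		}
	}
	return bad, nil
}

func main() {
	src := "package app\n\nimport (\n\t\"fmt\"\n\t\"unsafe\"\n)\n"
	bad, err := restrictedImports("app.go", src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if len(bad) > 0 {
		fmt.Println("build rejected, restricted imports:", bad)
		return
	}
	fmt.Println("build allowed")
}
```

A check like this only sees direct imports of one file; covering transitive dependencies is what makes the "go list" approach in the post attractive.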

During an update, the gaetool will copy the whole app and all its dependencies into a tmp directory, and call appcfg.py update on it.

gaetool, when shutdown via ctrl-c or a process kill, will also shutdown all the processes it is managing (i.e. api_server python process, Go App process).


Wednesday, March 14, 2012

Trayvon Martin: Unarmed black kid shot for walking in wrong neighbourhood


Trayvon Martin: Unarmed black kid shot for walking in wrong neighbourhood. His killer must be prosecuted.

UPDATE: Mar 19, 2012

It's surprising to me that only yesterday, after the FBI and Justice Department decided to wade in, did we get an account of Zimmerman's claim of self-defense.
http://www.miamiherald.com/2012/03/19/2703029/us-department-of-justice-fbi-and.html "Zimmerman said he had stepped out of his truck to check the name of the street he was on when Trayvon attacked him from behind as he walked back to his truck, police said. He said he feared for his life and fired the semiautomatic handgun he was licensed to carry because he feared for his life.".  
Come on ... this doesn't even add up with the 911 recording which the police have. And they let him go? In the 911 recording, he gets out of his car and goes after Trayvon, and ends up killing him in someone's backyard. This doesn't even sound like "stopping at a street corner to check the street name". The Sanford police seem extremely complicit in a cover-up, when an innocent life was taken.
Man, I love Miami ... but FL is looking real scary now.
I hope many of you have read about the case of a lil young black kid (Trayvon Martin: 140 lbs, 17 years old, unarmed, with a bag of skittles and a can of soda) who was shot in the chest by George Zimmerman (a 240lb, 28 year old, armed with a gun), for acting suspiciously in his neighbourhood. The victim's crime was that he was walking in a community where he was not expected to be. George followed Trayvon in his car and called the cops about a suspicious person in his neighbourhood. The cops asked him to stand down and said they would dispatch an officer. He didn't, and confronted Trayvon; a confrontation ensued, ending in Trayvon being shot in the chest. George claims self-defense against a kid 11 years younger and 100 pounds lighter, with only a bag of skittles and a can of soda, even though George was the armed aggressor who followed a pedestrian in his car?


I've been in contact with the police department via email, and the police chief, Bill Lee, has responded with the standard "we know more in the investigation, and I understand your concerns".

I think it's a travesty that, 2 weeks later, the killer has not been arrested. The cops even "allegedly" coached the killer in his statement by asking leading questions, and had witnesses changing their story.

Even after 2 weeks, the cops cannot say why this is a case of self-defense, just responding that "the killer has made a statement of self-defense, and we have to go with that unless we have proof otherwise".

There are so many things wrong with this story, but as usual, please research it yourself and, if you feel that you should, sign the petition on change.org. It's short of the goal of 35,000 signatures (about 28,000 so far). Also, share this with your social network, and let's hopefully stop something like this from happening ever again. We all have to start putting more value on human life.




Saturday, December 3, 2011

Dev Tool for GO AppEngine

A development tool for GO App Engine that presents an easier-to-use wrapper for App Engine development with the GO Runtime, bypassing some pitfalls caused by integration with the Python SDK.

The source is available online, and the motivation for building this is described below. What irks one person may not irk the other, so your utility of this tool may differ from mine. For me, the utility is really high:
  • Requests are much faster
  • I get a better appreciation of how fast my GO App is
  • I can see how many concurrent requests my app can handle
  • I can now do load tests which really stress out my GO App, giving me finer results and seeing how it handles resources

Get Source

Source code for the tool is available at: http://code.google.com/p/go-gae-dev/

To clone, build and test it out:
    mkdir -p /tmp/go-gae-dev
    cd /tmp/go-gae-dev
    git clone https://code.google.com/p/go-gae-dev/ .
    export GOPATH=`pwd`
    cd src/gogaedev
    goinstall -nuke .
    ./gogaedev

Read on to see how to test it out on your own setup.

Background

Currently, the GO App Engine SDK bundles the Python App Engine SDK, and uses it as a http proxy and an API server. This means that all requests come into the Python SDK first.

The Python SDK does the following on behalf of the GO SDK:
  • If the Python SDK determines that it is a static file request, it serves it directly from the filesystem
  • If the Python SDK determines that it is an internal appengine request, it serves it directly
  • Else it proxies it over to the GoApp
    • If the GoApp is not running, it starts it
    • If any file has changed since the last request was initiated, it will rebuild and/or restart the GoApp
    • It then proxies the request over to the GoApp, and returns the response back to the client
The problems with these are:
  1. Python SDK is single-threaded. It handles one request at a time. Consequently:
    1. GO App development cannot support concurrent requests 
    2. For each request to GO App, all files in the directory are checked to see if the app should be reloaded
    3. Web Requests are slower since each static file must be loaded one at a time.
  2. Python SDK checks all files for each request, making things slower. 
  3. GoApp Log files are interspersed with the Python SDK log, and it's hard to separate log files for each run.
  4. App.yaml is used for both development and production:
    1. In itself, this is not a problem
    2. However, in practice, it can be. For example, you may want to use a different initialization file in development than in production. You may want to skip some files (tests, dev setup) in production, but use them in development. (Python SDK even has the allow_skipped_files flag for this). 
  5. Performance Profiling
    1. The Python SDK frontend prevents me from really seeing how well my GO App performs.
    2. I cannot do load tests, or stress the server, or see how concurrent requests may cause things to fail because I'm sharing memory where I should be using some of the concurrency primitives effectively.
The beauty of GO is that all these features are supported by GO libraries in very elegant ways, and we do not have to put up with *all* the Python SDK limitations if all we need Python SDK for is to serve as an RPC server for AppEngine services (ie datastore, memcache, etc), while leaving Go App to do the same thing it would be doing in a production environment.

There's a saying that: "If there's an itch I really want to scratch and I don't, ... I eventually will, ... just so I can move on". It's paraphrased, but I think Brad Fitzpatrick eloquently speaks to it here:  http://bradfitz.com/talks/2011-09-Djangocon . Anyhow, I finally gave in and wrote a tool that does everything, relegating Python to just serving as an RPC server and serving the admin console and other /_ah/* requests.

What the tool does

This tool will do the following:
  1. Launch Api Server (Python SDK)
    if nothing is listening on the Python API Unix Socket
  2. Launch go app (just like python sdk does) and write log files to a certain directory
    if nothing is listening on the Go App Unix Socket
  3. Create a proxy to do the following (by default):
    • requests matching / or /_ah/warmup, etc go to GO
    • requests matching /_ah/*, /form* go to Python
    • requests matching static files are served by this "helper" process
    • all others go to GO
  4. Watches for changes to directories for my app
    • If any .go file there changes, it will rebuild and restart the app
    • If any other file changes, it will just restart the app
  5. Watches for some files and keeps them in sync.
    • If any source file changes, it will copy it over to the corresponding dest
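
The routing rules in step 3 can be sketched as a small decision function. The patterns are the defaults from the go-gae-dev-cfg.json shown in this post (InitialCheckGoPaths, ApiPaths, StaticPaths). Two things are assumptions of this sketch rather than documented behaviour of the tool: that each pattern must match the whole path, and that the rules are tried in the order listed above.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchored compiles p so that it must match the entire request path
// (an assumption; the tool may match differently).
func anchored(p string) *regexp.Regexp {
	return regexp.MustCompile("^(?:" + p + ")$")
}

var (
	goFirstPaths = anchored(`/$|(/_ah/(warmup)(/.*)?)`)
	apiPaths     = anchored(`/_ah/.*|/form.*`)
	staticPaths  = anchored(`/web(/.*)?|(.*\.(gif|png|jpg|jpeg|ico|css|js|json))`)
)

// route names the backend that should serve the given URL path.
func route(path string) string {
	switch {
	case goFirstPaths.MatchString(path):
		return "go"
	case apiPaths.MatchString(path):
		return "python"
	case staticPaths.MatchString(path):
		return "static"
	}
	return "go" // everything else goes to the Go app
}

func main() {
	for _, p := range []string{"/", "/_ah/warmup", "/_ah/admin", "/web/app.css", "/hello"} {
		fmt.Printf("%-14s -> %s\n", p, route(p))
	}
}
```

In a real proxy the returned name would select a reverse proxy or a file server handler; here it is just a string so the rule order is easy to check.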


Setup of Python SDK:

With this setup, the only requirement from the Python SDK is that it should initialize its API server socket at startup, not at the first GoApp CGI request. 
To set up the API server accordingly, two files have to be edited:
  1. google/appengine/tools/dev_appserver_main.py
    to create and listen on the Go API Server Socket at startup
  2. google/appengine/tools/dev_appserver.py
    to remove (comment out) call to execute_go_cgi
Edit google/appengine/tools/dev_appserver_main.py:  
to create and listen on the Go API Server Socket at startup 
(i.e. before call to http_server.serve_forever).
    #ugorji: add call to setup api hook port
    if appinfo.runtime == 'go':
      import threading, getpass, atexit, asyncore
      import google.appengine.ext.go as go
      from google.appengine.ext.remote_api import handler
      user_port = '%s_%s' % (getpass.getuser(), port)
      go.SOCKET_API = go.SOCKET_API % user_port
      go.SOCKET_HTTP = go.SOCKET_HTTP % user_port
      go.GAB_WORK_DIR = go.gab_work_dir() % user_port
      go.cleanup()
      atexit.register(go.cleanup)
      go.RAPI_HANDLER = handler.ApiCallHandler()
      ds = go.DelegateServer()
      def asynCoreLoop():
        while ds.connected or ds.accepting:
          asyncore.loop(map=ds._map, count=1)
      th = threading.Thread(target=asynCoreLoop)
      th.setDaemon(True)
      th.start()
Edit google/appengine/tools/dev_appserver.py 
to remove call to execute_go_cgi, by commenting out the whole code block. 
This way, Python SDK does not try to proxy requests over to the Go App.
  # Ugorji: remove hook for _go_app
  # if handler_path == '_go_app':
  #   from google.appengine.ext.go import execute_go_cgi
  #   return execute_go_cgi(root_path, handler_path, cgi_path,
  #       env, infile, outfile)

How To Use This Tool:
-    User creates a file go-gae-dev-cfg.json, and puts it in their app directory. 
     A complete one looks something like this (below, we show what the defaults are):
       {
         "Verbose": false,
         "Succinct": true,
         "UseFSWatch": false,
         "IncludeChildProcLogs": false,
         "GaeSdkDir": "",
         "AppId": "app",
         "AppVersion": "1",
         "GoFilesToIgnore": "abc",
         "WatchPathsToIgnore": "(.* /)?(_go_\\.[0-9]|_obj|.*[#~].*)",
         "WatchDirNamesToSkip": "^(cmds|_obj|\\..+|_.*)$",
         "StaticPaths": "/web(/.*)?|(.*\\.(gif|png|jpg|jpeg|ico|css|js|json))",
         "InitialCheckGoPaths": "/$|(/_ah/(warmup)(/.*)?)",
         "ApiPaths": "/_ah/.*|/form.*",
         "ApiServerHttpURL": "http://localhost:8080",
         "GoServerHttpURL": "http://localhost:8088",
         "ManageGoApp": true,
         "ManageApiServer": true,
         "ProxyAddr": ":8888",
         "ApiParams": [ "--allow_skipped_files", "--skip_sdk_update_check" ],
         "LogDir": "/tmp/gogaedev_USERNAME_logs",
         "StaticFilesDir": ".", 
         "ManageGoAppIntervalSecs": 5,
         "FilesToSync": { },
         "DirsToWatch": [
           "."
         ]  
       }
     A lot of these entries are reasonable/adequate defaults and can be omitted. 
     At a minimum, configure these:
       {
         "GaeSdkDir": "/opt/go-appengine-1.6.0/google_appengine",
         "GoFilesToIgnore": ".* /((main|yaml)/.*|app_(dev|prod)\\.go)"
       }
     To sync some files, e.g. if app_dev.go changes, copy it over to app_for_env.go, use:
       {
         "FilesToSync": {
           "app/server/app_dev.go": "app/server/app_for_env.go"
         }
       }

-    User must add a http tcp listen address to their app. A quick way to do this is, 
     within an init() method in your app, add:
       if appengine.IsDevAppServer() { go http.ListenAndServe(":8088", nil) } // port of the default GoServerHttpURL; :8888 is the proxy
-    Within your app directory, run gogaedev:
       gogaedev
-    Access application as usual, but using the proxy (instead of going through Python SDK). 
       E.g.
       http://localhost:8888/_ah/admin
       http://localhost:8888/


Wednesday, November 16, 2011

GO App Engine datastore operations design


GO App Engine datastore.Load/Save uses goroutines and channels to iterate over datastore entity properties, causing overhead.

Background
With GAE 1.6.0, support for indexed properties, hooks, etc. was introduced with a nice, elegant design using a PropertyLoadSaver interface that uses channels (as an iterator).

I noticed that, after updating my code to use PropertyList, some of my application requests started taking about double the time they had taken before. As long as I stayed on datastore.Map, my requests took roughly the same amount of time as before the upgrade.

On digging further, I found the following in the implementation:
    appengine/datastore/load.go
      func loadEntity(dst interface{}, src *pb.EntityProto) ...
          c := make(chan Property, 32)
          errc := make(chan os.Error, 1)
          go protoToProperties(c, errc, src)
    appengine/datastore/save.go
      func saveEntity(defaultAppID string, key *Key, src interface{}) ...
          c := make(chan Property, 32)
          donec := make(chan struct{})
          go func() { ... }

That is, for each entity (analogous to each row in a table), we create and use:
     1 goroutine and 2 channels.

The deprecated datastore.Map retrieval bypasses this Channel/Goroutine dance, which is why my response time did not change until I switched to datastore.PropertyList.

Concerns:
- For each Get/Save request that loads/saves n entities, n goroutines are started and 2*n channels are created.
  This is analogous to starting 1 goroutine and creating 2 channels for each row in a SQL result set.
    For example, one API RPC call that returns 100 entities will spawn 100 goroutines and create 200 channels just for this one API call.
- However, the requests are still serialized
  (ie one load/save conversion is performed before the other).
  So we don't gain any parallelism but pay a large cost.
- This is a significant overhead for a simple iteration where concurrency is not a goal.
- There's still significant allocation (which is what we were trying to avoid).
  - each "row" or entity causes a new goroutine and new channels to be created.
  - each channel (for iterating properties) has a buffer size of 32.
  There ends up potentially being more allocation than if we just returned []Property.
- Implementation detail bleeding into the API, making it harder to optimize later.
  By tying the API to channels, we cannot optimize out of this later on.
- Also, we seem to be using the "channels as iterator" anti-pattern, which is frowned upon, especially by the GO team. See:
  Russ Cox: http://code.google.com/p/go/source/detail?r=ed32ab5693
  Russ Cox: https://groups.google.com/d/msg/golang-nuts/jb4YfdFwmmM/P-55mxV0a8oJ
  David Symonds: https://groups.google.com/d/msg/golang-nuts/bcAWzaSYC0Y/nk-b5fUR_loJ
  http://stackoverflow.com/questions/5033605/common-programming-mistakes-for-go-developers-to-avoid/5034195#5034195

Can we do without the goroutines/channels, especially in the API? This way, we can use different implementations.

Alternative solution using iterators
An alternative, equally elegant solution would just use iterators:
- type PropertyIterator interface:
  Next() (Property, os.Error) // to signal the end, os.EOF/datastore.Done is returned
- type PropertyLoadSaver interface:
  PropertyLoad(p PropertyIterator) os.Error
  PropertySave() (PropertyIterator, os.Error)

For implementations of PropertyIterator:
- type PropertyIteratorFunc func() (Property, os.Error)
  Next() (Property, os.Error) //calls itself
- type PropertyList []Property:
  Iterator() PropertyIterator //actually a PropertyIteratorFunc
- type ChanPropertyIterator chan Property: //implements PropertyIterator itself
  Next() (Property, os.Error) //does a <- on channel, and returns os.EOF/datastore.Done appropriately
  // optional: for people who prefer goroutine/channel alloc over slice alloc
  // (this would be like the current solution, but not exposed in the API)
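
Here is a minimal runnable sketch of the iterator design outlined above. It is written against current Go, so error and io.EOF stand in for os.Error and os.EOF, and Property is a simplified stand-in for datastore.Property. The collect helper is a hypothetical addition that shows a consumer that works with any implementation.

```go
package main

import (
	"fmt"
	"io"
)

// Property is a simplified stand-in for datastore.Property.
type Property struct {
	Name  string
	Value interface{}
}

// PropertyIterator yields properties one at a time; io.EOF signals the end.
type PropertyIterator interface {
	Next() (Property, error)
}

// PropertyIteratorFunc adapts a plain function to PropertyIterator.
type PropertyIteratorFunc func() (Property, error)

// Next calls the function itself.
func (f PropertyIteratorFunc) Next() (Property, error) { return f() }

// PropertyList is a slice-backed set of properties.
type PropertyList []Property

// Iterator walks the slice without any goroutine or channel.
func (l PropertyList) Iterator() PropertyIterator {
	i := 0
	return PropertyIteratorFunc(func() (Property, error) {
		if i >= len(l) {
			return Property{}, io.EOF
		}
		p := l[i]
		i++
		return p, nil
	})
}

// ChanPropertyIterator keeps the channel approach available, but only as
// an implementation detail behind the same interface.
type ChanPropertyIterator chan Property

// Next receives from the channel and reports io.EOF once it is closed.
func (c ChanPropertyIterator) Next() (Property, error) {
	p, ok := <-c
	if !ok {
		return Property{}, io.EOF
	}
	return p, nil
}

// collect drains any iterator into a PropertyList (hypothetical helper).
func collect(it PropertyIterator) (PropertyList, error) {
	var out PropertyList
	for {
		p, err := it.Next()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, p)
	}
}

func main() {
	list := PropertyList{{Name: "Title", Value: "hello"}, {Name: "Count", Value: 3}}
	got, err := collect(list.Iterator())
	fmt.Println(len(got), err)
}
```

Because callers only see PropertyIterator, the slice-backed and channel-backed versions are interchangeable, which is the point of keeping channels out of the API.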

Since GO Runtime is still experimental, making a contained API change should be ok.

But RPC dominates the overhead per request. Why focus on goroutines/channels use?

Definitely, the RPC time will dominate the overhead from a goroutine and 2 channels. However, we're talking about potentially 100's or 1000's of goroutines per request (equal to the number of "rows" returned by, or sent to the API call). E.g. for a GET that returns 100 entities, that's 100 goroutines and 200 channels created to service that 1 API call. And these goroutines/channels we're making have nothing to do with concurrency: we're just using this for iterators.

Also, within our application code, we still have to optimize our code (and especially our exported APIs), even though we know that RPC overhead will overshadow it.

Main Concern: Implementation bleeds into the API
My main concern is that this bleeds into the API. By using Iterators, you can use channels and a goroutine in the implementation, and change that afterwards, without application users having to know about it.

The alternative implementation proposed above shows how this can be done using iterators. It's trivial to implement (in GO code) and you can gain what you want, without restricting your implementation:
- Objects don't need to exist longer than needed to populate the fields
- Intermediate state is supported
- No need to pass around []Property for a large entity

However, the API is not tied to an implementation, so you can implement with goroutines/channels, or with a List. User code that passes a PropertyLoadSaver can use whatever is most applicable/optimized for his usecase. For example, in my user code, I can pass PropertyList into each call and will not incur the overhead of goroutines/channels.

Have others solved similar problems using goroutines/channels? Where?
It seems that the use of goroutines/channels as iterators is not done in other *similar* places:
- See datastore.Query whose iteration doesn't expose goroutines/channels
- See exp/sql/driver whose iteration doesn't expose goroutines/channels (just a Next([]interface{}) method)

What is the performance overhead (load on CPU, RAM) with this? Does it scale?

Initially, when I did this, I ran some rudimentary tests to find the maximum number of goroutines I could create on my machine and how much resources it took.

The summary of the results is that, on a 2.0GHz core, I could start a maximum of 5e5 (500,000) goroutines which basically did nothing (beyond that, I got errors). The RAM usage was 2.0GB.

An app engine instance is 600MHz single core with 128MB limit. That's roughly 1/3 of the CPU and 1/16 of the memory. (Even my nexus one has way more resources than that.)

In summary, 2.0GHz, 2GB RAM produced 500,000 goroutines max. I wonder how many a 600MHz, 128MB app engine instance would accommodate.

I'd suspect a few thousand goroutines on such a tiny "computer" (600MHz, 128MB) would tax the system. However, it's really easy to get into such a situation with the current design. If most of the time is spent on RPC (I/O) and CPU load is low, GO can easily support a large number of concurrent requests. 50 concurrent requests each retrieving 200 entities will mean 10,000 goroutines (+20000 channels) at the same time, just serving API requests, and imposed by the SDK runtime (ie not application code which we can control or tune). In this scenario, the runtime is imposing an overhead which does not seem necessary.

If we expect that most people will pass a PropertyList to calls to GetXXX or PutXXX, then the goroutine/channel is completely redundant.

Also, remember that each goroutine allocates an initial stack of 4K, so each goroutine has a cost in memory allocation, which becomes non-trivial under load.

The rudimentary Go code used to run this test is available at:
- Shared online: You can download a Go file here, to compile and run on your computer.
- On Golang Play: You need to run this on your local computer.
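
For readers who cannot reach the shared file, a test along these lines can be sketched as follows. This is not the original test program: the function names (`spawnIdle`, `measure`) and the batch sizes are assumptions made for illustration, and it relies on `runtime.ReadMemStats`, so the numbers will differ between Go versions and machines.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// spawnIdle starts n goroutines that block until release is closed.
// It returns once all n goroutines are running.
func spawnIdle(n int, release <-chan struct{}, done *sync.WaitGroup) {
	var started sync.WaitGroup
	started.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			started.Done()
			<-release // do nothing, just stay alive
		}()
	}
	started.Wait()
}

// measure reports the number of live goroutines and the change in memory
// obtained from the OS while n idle goroutines are alive.
func measure(n int) (goroutines int, extraBytes int64) {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)

	release := make(chan struct{})
	var done sync.WaitGroup
	spawnIdle(n, release, &done)

	goroutines = runtime.NumGoroutine()
	runtime.ReadMemStats(&after)

	close(release) // let every goroutine exit
	done.Wait()
	return goroutines, int64(after.Sys) - int64(before.Sys)
}

func main() {
	for _, n := range []int{1000, 10000, 100000} {
		g, b := measure(n)
		fmt.Printf("requested=%d live=%d extra memory=%d KB\n", n, g, b/1024)
	}
}
```

Raising the batch sizes until the program fails gives a rough upper bound for a given machine.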

Monday, November 14, 2011

Testing Go App Engine Applications natively



With changes to allow concurrent requests in Go App Engine, Testing support follows naturally and natively.

Following support for concurrent requests described previously, Testing support is as easy as ensuring the following is called one time before your test is run. I have tested it and it works flawlessly.

(I call this once using sync.Once.Do(...) within an init function)
//don't conflict with the http socket that devserver's go_app uses (replace ugorji with your username)
 //os.Remove takes a plain file path, so the "unix:" prefix used in the flag value is dropped here
 os.Remove("/tmp/dev_appserver_ugorji_8080_socket_http_2")
 flag.Set("addr_http", "unix:/tmp/dev_appserver_ugorji_8080_socket_http_2")
 flag.Set("addr_api", "unix:/tmp/dev_appserver_ugorji_8080_socket_api")
 go appengine_internal.Main()
 //give the server goroutine a moment (1 second) to start
 time.Sleep(1e9)

Once this is done, and you have a Python Dev Server running, then all your normal calls work.
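
The run-once guard mentioned above can be sketched in isolation. In this minimal example, `startDevAPI` is a hypothetical stand-in for the flag.Set / appengine_internal.Main() sequence (which needs the SDK and a running dev server), and `ensureSetup` is an assumed helper name; only the sync.Once usage is the point.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	setupOnce  sync.Once
	setupCalls int // counts how often the setup body actually ran
)

// startDevAPI is a hypothetical placeholder for the real setup:
// setting the addr_http/addr_api flags and starting appengine_internal.Main().
func startDevAPI() {
	setupCalls++
}

// ensureSetup may be called from every test; the setup body runs only once.
func ensureSetup() {
	setupOnce.Do(startDevAPI)
}

func main() {
	for i := 0; i < 3; i++ {
		ensureSetup()
	}
	fmt.Println("setup ran", setupCalls, "time(s)")
}
```

Calling `ensureSetup()` at the top of each test function (or from an init function) keeps the tests independent of execution order.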

To create a context and use it:
 req, _ := http.NewRequest("GET", "/", nil)
 req.Header.Set("X-AppEngine-Inbound-AppId", "dev~app")
 ctx := appengine.NewContext(req)
We need to do the dance of setting flags and stuff because the appengine_internal.Main is what is called by your app's main() method. It uses parameters passed on the command line, and it internally will start a server socket for http (which is why we have to run it in a goroutine). We have to use this function because it is exported.

We really only need the initAPI function. If it were exported, making testing seamless would be a 1-line call:
    appengine_internal.InitAPI("unix", "/tmp/dev_appserver_ugorji_8080_socket_api")

To make this possible, it would be nice if the appengine_internal.initAPI function were exported with this signature:
    appengine_internal.InitAPI(netw, addr string)

Enable Concurrent Requests in Go App Engine SDK


This details how to enable concurrent requests in the Go App Engine SDK.

UPDATES:
Nov 15:
Added that python sdk is currently not threadsafe. This shows how to make GO side threadsafe, and still test concurrency in your application (even though only 1 API request is processed at a time).


Background

The GO App Engine SDK has a pretty elegant design which I wish the Java App Engine SDK had. The full SDK, with support for App Engine services, is implemented once (in Python), and a new language runtime (like Go) can be introduced quickly, leveraging that investment (as opposed to duplicating it). Brilliant.

It also simulates what happens in production to an extent, where there's an App Engine instance that runs your application, but uses RPC (remote procedure calls) to access services provided by App Engine.

In this setup, the Python SDK which supports all the App Engine Runtime services acts as two things:
  • A front end.
    Non-app requests are handled by the Python SDK front-end, and app requests get proxied over to the Go Application Instance.
  • RPC Server.
    All App Engine services reside on the Python SDK. The Go Application uses RPC to access those services.

Getting everything to work is pretty neat.
  • The Python Dev Server creates a Go Instance as a child process
  • The Go Instance creates a Server Socket which the Python SDK uses to proxy http requests to it
  • The Python Dev Server creates a Server Socket which the Go Instance uses to send API requests to it
  • Only one request happens at a time, as detailed below.

When a request comes through the Python SDK for the Go App, the following happens:
  • The Python SDK creates a socket to Go Instance and sends the http request to it
  • The Go Instance handles the request.
    For any API calls, it makes a socket connection to the Python SDK API, and sends the request and receives the API response back.
  • The Go Instance sends the response back to the Python SDK
  • The Python SDK forwards the response to the client

Currently, the design has some limitations that allow only one request to be handled at a time:
  • This implementation uses CGI
  • Handling socket communication only occurs within the context of a request i.e. the sockets are not listened to unless a single request is in process

Objective:

The objective here is to support concurrent requests. This can be done by making the Python SDK a full proxy, with standalone support as an API RPC Server (outside the context of a request).

This will allow more involved testing scenarios:
  • Have tests running directly on the GO Server within the regular context of a request (including common work done before and after a handler is called)
  • Have tests using the Python SDK server directly for API calls

To summarize, these are the things we hope to achieve:
  • Let the Python SDK be a true and full proxy to Go Instance, allowing concurrent requests be proxied and handled.
  • Honor allow_skipped_files flag (to allow skipped files e.g. test files, etc)
    Allowing skipped files in development is very necessary for tests, pre-building, etc.
  • Support testing framework, which can access the Python SDK as an API server without going through a request.
    This way, testing can involve just starting a Python Dev Server (even if no http request happens).
To achieve this, the following changes are necessary:
  • Use the Python 2.7 SDK which allows for concurrent requests
  • Use WSGI (as opposed to CGI) which allows for concurrent requests 
  • Have API socket listening and handling be always-on (not only when a http request is in process).
    Use a thread to listen to and respond to all API socket communication (listening and handling)
  • Have a setup/init function that is run when the Python SDK is started for a GO Runtime, as opposed to a one-time run when a http request happens
This support is achieved by minor edits to 2 files, a more involved edit to 1 file, and a one-line change in your app.yaml:
  1. google/appengine/ext/go/__init__.py (more involved edit)
  2. google/appengine/tools/dev_appserver_main.py (minor edit)
  3. google/appengine/tools/dev_appserver.py (minor edit)
  4. app.yaml (to reference the WSGI app instead of _go_app)
I've shared a folder containing all of the changed files online here. Feel free to download the changed files and follow through. For all changes, look for the name "ugorji" in a comment in the file before each change.

But Python SDK does *NOT* support concurrent requests
Yes. Even with these changes, requests to the Python SDK are still inherently single threaded:
  • dev_appserver...serve_forever() will handle one request at a time
  • dev_appserver is not thread safe. In the midst of multiple threads handling requests, it gets datastore collisions and barfs
Thus, these changes will make the GO side run concurrently. A user can add a back-door http listen port and access the GO instance directly. I do this within an init() or sync.Once.Do(...) surrounded by an if appengine.IsDevAppServer() { ... }

    http.HandleFunc("/", ...)
    http.ListenAndServe(":9999", nil)

Also, within your top-level request handling code, do a check to ensure the header for contexts is set. This is necessary because the Python SDK will add this to the headers proxied to your application. Bypassing the python proxy requires that at a minimum, you set this yourself (before creating any appengine.Context).

    if r.Header.Get("X-AppEngine-Inbound-AppId") == "" {
        r.Header.Set("X-AppEngine-Inbound-AppId", "dev~app")
    }

After that, you can make requests at http://localhost:9999 and get access to your application. Requests through this URL can run concurrently. Access to APIs will still be serial (one at a time), but you can still test concurrency in general for your application. This way, only API requests block and everything else runs concurrently.

When Python SDK becomes thread safe, we only need to make a few changes to be compliant.

  1. On Go AppEngine end, update the following:
    1. appengine_internal.InitAPI:
      just store the network address for the API server
    2. appengine_internal.call:
      open/close a connection to API server for each request
  2. On Python SDK extension:
    1. ext/go/__init__.py:
      change DelegateServer to listen(n) where n is number of concurrent requests supported e.g. listen(10)



Tuesday, September 27, 2011

Datastore Enhancement for GO Language Runtime in Google App Engine


This attempts to make a case for datastore enhancements in the GO Language Runtime of App Engine.

Quoting from http://code.google.com/appengine/:
Google App Engine enables you to build and host web apps on the same systems that power Google applications. App Engine offers fast development and deployment; simple administration, with no need to worry about hardware, patches or backups; and effortless scalability.
And paraphrasing from http://golang.org/
Go is expressive, concise, clean, and efficient, with sophisticated concurrency primitives and a novel type system which enables flexible and modular program construction. It's a fast, statically typed, compiled language that feels like a dynamically typed, interpreted language, complete with garbage collection and run-time reflection. GO is the bees-knees.
I spent the last year building an app on the Java Runtime, waiting on the GO Language Runtime to hopefully become available. It became available in July, and I started working with it.

It is a much better fit for new application development IMHO. Among the many reasons are:
  1. I had to worry less about creating abstractions and solutions, and could just focus on the task at hand. The natural primitives and modern bundled API's make me less reliant on 3rd party code or custom built solutions.
  2. It is conceivably more performant than the Java Runtime (since it uses GO-Routines which could scale more than just pure OS threads). I look at the way things like NodeJS scale, and think GO's model could get me closer to that.
  3. The runtime is leaner and meaner (more CPU space, more RAM space, less runtime overhead ==> more requests handled per instance)
  4. It has all the modern features I really need: closures, first-class functions, extensive type system, conversions, clean syntax, and simple yet sophisticated builtins for concurrency and messaging.
  5. Programming is WAY more fun and productive (it's not even close) 

GO is truly an extremely delightful language, and I am very excited to hop on it.

There are some features I have built over the datastore on the Java Runtime, which also exist bundled on the Python runtime, and which I have come to depend on. I'll enumerate them, and then try to explain and make a case for them. I would love for the App Engine GO Team to discuss these, and see if they could/would be implemented natively in the GO SDK.

Datastore Features

The features are enumerated below:
  1. "Optional" Integrated caching: L1 (request-scoped, in-process) and L2 (Memcache)
  2. Embedded Types (stored as . separated columns)
    Includes support for storing maps of primitives to primitives (2 columns: fieldName and fieldName_)
  3. Alternate Datastore Column Names for fields
  4. Callbacks: preSave, postLoad. Allow app reject a load/save request also
  5. Functions for decisions: store/index this property? 
  6. Polymorphic Queries 
I'll try to describe and make a case for each one below.

1. "Optional" Integrated caching: L1 (request-scoped, in-process) and L2 (Memcache)

Caching is now much more important with the new billing structure. It would be nice if it was transparent, in such a way that the SDK "could" check caches (request-scoped in-process, and longer-lived memcache) before checking the datastore, for GET's and also for PUT's and DELETE's. Folks could "configure" L1 and/or L2 caches for specific structs, and just depend on the SDK API's to do it transparently.

Without the SDK providing it, almost everyone will create a custom solution, which would end up being wrapper functions around most of those SDK API's.

Caching will bypass queries and transactional GETs, and clear entries during transactional PUTs/DELETEs.
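
A minimal sketch of the read-through idea follows. Everything here is an assumption made for illustration: the `Store` interface, the `Cached` wrapper, and the in-memory `mapStore` standing in for the request-scoped cache, Memcache, and the datastore. It writes through on PUT for simplicity, where a real implementation might clear entries instead, and it ignores queries and transactions entirely.

```go
package main

import "fmt"

// Store is a hypothetical minimal key/value interface shared by all tiers.
type Store interface {
	Get(key string) (string, bool)
	Put(key, val string)
	Delete(key string)
}

// mapStore is an in-memory stand-in for L1, Memcache, or the datastore.
type mapStore struct {
	data map[string]string
	gets int // number of Get calls, to show which tier was hit
}

func newMapStore() *mapStore { return &mapStore{data: map[string]string{}} }

func (m *mapStore) Get(k string) (string, bool) {
	m.gets++
	v, ok := m.data[k]
	return v, ok
}
func (m *mapStore) Put(k, v string) { m.data[k] = v }
func (m *mapStore) Delete(k string) { delete(m.data, k) }

// Cached checks L1, then L2, then the datastore, back-filling on the way out.
type Cached struct {
	L1, L2, DB Store
}

func (c *Cached) Get(k string) (string, bool) {
	if v, ok := c.L1.Get(k); ok {
		return v, true
	}
	if v, ok := c.L2.Get(k); ok {
		c.L1.Put(k, v)
		return v, true
	}
	v, ok := c.DB.Get(k)
	if ok {
		c.L2.Put(k, v)
		c.L1.Put(k, v)
	}
	return v, ok
}

// Put writes to the datastore first, then refreshes both caches.
func (c *Cached) Put(k, v string) {
	c.DB.Put(k, v)
	c.L2.Put(k, v)
	c.L1.Put(k, v)
}

// Delete removes the key from every tier.
func (c *Cached) Delete(k string) {
	c.DB.Delete(k)
	c.L2.Delete(k)
	c.L1.Delete(k)
}

func main() {
	db := newMapStore()
	db.Put("user:1", "ada")
	c := &Cached{L1: newMapStore(), L2: newMapStore(), DB: db}

	c.Get("user:1") // misses both caches, reads the datastore
	c.Get("user:1") // served from L1
	fmt.Println("datastore reads:", db.gets)
}
```

This is the kind of wrapper that every application ends up writing around the SDK calls when the SDK does not provide it.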

2. Embedded Types (stored as . separated columns)

Imagine a struct like:
  type A struct { A1 int; A2 int; B1 B }
  type B struct { Ball1 int; Ball2 bool }
  var a A

Imagine you want to query on A.B1.Ball1. It would be nice to store the columns with dot-separated keys, i.e. an entity A could have columns: A1, A2, B1.Ball1, B1.Ball2.

The solution will also support a slice of B. Suppose we have:
  type A struct { A1 int; A2 int; B1 []B }
The columns are then stored as slices ie the column types for A1, A2, B1.Ball1, B1.Ball2 would be int, int, []int, []bool respectively.

The embedded-types structure stored is only applicable where there's a need to index and query on them. Where there's no need to index or query, they could be stored easily as blobs (using gob encoding).

In addition, it would be nice to store maps of primitives to primitives with support for querying. For example:
  type A struct { A1 int; A2 int; B map[string]bool }
We should be able to search for where A1 is 1, A2 is 5, and the map has a mapping for 'goog'.

3. Alternate Datastore Column Names for fields

With App Engine, the column names are replicated many many times during the storage of a single entity (during each index write, during entity write, during each composite index permutation write, etc). This can add significantly to the cost of storage. It was standard practice to define short alternate names for the datastore to use.

At the minimum, the SDK should respect this configuration during reads and writes. As a bonus, it can also respect it during queries (so that all code will just use the field names, not the configured alternate names).
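
One way such a configuration could be expressed in Go is with struct tags, the same mechanism encoding/json uses for field renaming. The sketch below is an assumption, not the SDK's API: the tag key `col`, the `Account` struct, and the helper names are all made up for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// Account uses a hypothetical `col` tag to give fields short datastore names.
type Account struct {
	LastModifiedTime int64  `col:"lm"`
	DisplayName      string `col:"dn"`
	Balance          int64  // no tag: falls back to the field name
}

// columnName returns the configured short name, or the field name if none is set.
func columnName(f reflect.StructField) string {
	if name := f.Tag.Get("col"); name != "" {
		return name
	}
	return f.Name
}

// columnNames lists the datastore column names for a struct value.
func columnNames(v interface{}) []string {
	t := reflect.TypeOf(v)
	names := make([]string, t.NumField())
	for i := range names {
		names[i] = columnName(t.Field(i))
	}
	return names
}

func main() {
	fmt.Println(columnNames(Account{}))
}
```

Application code keeps using the descriptive field names, while only the short names would be written to storage and indexes.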

4. Callbacks: preSave, postLoad. Allow app reject a load/save request also

The SDK should call methods on structs just after loading, and before saving. These methods should return an error which should signal to the SDK that it should not proceed completely with these entities. During a save, those entities should not be saved and should be returned in a MultiError. During a load, those should not be returned normally, but as part of a MultiError.

This allows several things:
  1. Modify entities after loading to compute extra properties which are not stored but evaluated at runtime (e.g. loadTime, isLoadedFromDatastore, etc)
  2. Modify entities before saving to reset some properties (e.g. lastModifiedTime, etc)
  3. Inform the SDK that a certain entity is dirty and should not be stored in the datastore
  4. Signal that a certain entity is dirty and should be treated carefully after loading
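
A minimal sketch of the save-side callback follows. The names are assumptions for illustration (`PreSaver`, `MultiError`, `saveAll`, and the `put` function standing in for the actual datastore write); the load side would mirror it with a PostLoad method.

```go
package main

import (
	"errors"
	"fmt"
)

// PreSaver is a hypothetical interface an entity implements to be called before saving.
type PreSaver interface {
	PreSave() error
}

// MultiError holds one entry per entity; a nil entry means that entity was accepted.
type MultiError []error

func (m MultiError) Error() string {
	n := 0
	for _, e := range m {
		if e != nil {
			n++
		}
	}
	return fmt.Sprintf("%d of %d entities rejected", n, len(m))
}

type Note struct {
	Text         string
	LastModified int64
}

// PreSave rejects empty notes and resets a property before saving.
func (n *Note) PreSave() error {
	if n.Text == "" {
		return errors.New("note is empty")
	}
	n.LastModified = 42 // placeholder for a real timestamp
	return nil
}

// saveAll calls PreSave where implemented, skips rejected entities,
// and passes accepted ones to put (a stand-in for the datastore write).
func saveAll(entities []interface{}, put func(interface{})) error {
	errs := make(MultiError, len(entities))
	failed := false
	for i, e := range entities {
		if p, ok := e.(PreSaver); ok {
			if err := p.PreSave(); err != nil {
				errs[i] = err
				failed = true
				continue
			}
		}
		put(e)
	}
	if failed {
		return errs
	}
	return nil
}

func main() {
	var saved []interface{}
	err := saveAll(
		[]interface{}{&Note{Text: "hi"}, &Note{}},
		func(e interface{}) { saved = append(saved, e) },
	)
	fmt.Println(len(saved), err)
}
```

Keeping one MultiError slot per entity lets the caller see exactly which entities were rejected and why.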

5. Functions for decisions: store/index this property?

Storage is expensive.

Indexing makes it proportionally more expensive. The number of write operations for each entity store is 1 + 2 × the number of indexes. A single entity update with 4 properties could be 1 write operation or 25 write operations (if all 4 properties are indexed asc and desc including delete and put index operations, and you have two composite indexes).

To alleviate this:

  1. Some entities should not be stored if they have errors.
  2. Some properties in them should not be indexed if we're never going to query on those values.

For example, we may only need to query on a boolean property if it is true. Or only need to query on a numeric property if it is over a threshold (e.g. 100). In these scenarios, we only want to index the property if its value is true or >100 respectively. A simple true|false configuration will not suffice. We need to evaluate the decision at runtime just before the entity is stored.

The callbacks defined above could help here.
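
The runtime decision described above can be sketched as a method that builds the property list just before saving. The `Property` struct here is a simplified, hypothetical stand-in (not the SDK's type), and `Stats` with its threshold of 100 is only the example from the text.

```go
package main

import "fmt"

// Property is a hypothetical, simplified stand-in for a datastore property.
type Property struct {
	Name  string
	Value interface{}
	Index bool // whether this value should be written to an index
}

type Stats struct {
	Featured bool
	Score    int64
}

// properties decides per value, at save time, whether each property is indexed:
// Featured only when true, Score only when it is over 100.
func (s Stats) properties() []Property {
	return []Property{
		{Name: "Featured", Value: s.Featured, Index: s.Featured},
		{Name: "Score", Value: s.Score, Index: s.Score > 100},
	}
}

func main() {
	s := Stats{Featured: false, Score: 250}
	for _, p := range s.properties() {
		fmt.Printf("%s=%v indexed=%t\n", p.Name, p.Value, p.Index)
	}
}
```

A static true|false setting per field cannot express this, which is why the decision has to be a function of the value.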

6. Polymorphic Queries

Polymorphic storage allows you to store different structs in the same datastore kind, using a discriminator column to find and load the appropriate struct. This simulates efficient JOINs (since it's really just different entities stored in the same table). A solution like this is also natural for a datastore like BigTable (since entities need not be uniform).

With the general model of the SDK (where a struct is passed to the GET request), it is a bit less natural to implement. A possible solution may involve passing in a nil interface{}, so that the API would look for a discriminator column and determine the type of struct to return. (The JSON package does something like this in its Unmarshal function, where passing in a pointer to a nil interface{} allows it to create and return a different object.)
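
The discriminator lookup can be sketched with a registry that maps discriminator values to constructors. All names are assumptions for illustration: `Row` stands in for a loaded entity, `_type` is a made-up discriminator column name, and `Cat`/`Dog` are example structs.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for the raw columns of a loaded entity.
type Row map[string]interface{}

// Loader is implemented by every struct that can be filled from a Row.
type Loader interface {
	Load(Row)
}

type Cat struct{ Name string }

type Dog struct {
	Name   string
	Tricks int64
}

func (c *Cat) Load(r Row) { c.Name, _ = r["Name"].(string) }

func (d *Dog) Load(r Row) {
	d.Name, _ = r["Name"].(string)
	d.Tricks, _ = r["Tricks"].(int64)
}

// discriminator is the (assumed) column that names the concrete struct type.
const discriminator = "_type"

// registry maps discriminator values to constructors.
var registry = map[string]func() Loader{
	"Cat": func() Loader { return new(Cat) },
	"Dog": func() Loader { return new(Dog) },
}

// load inspects the discriminator column and returns the matching struct.
func load(r Row) (Loader, error) {
	kind, _ := r[discriminator].(string)
	factory, ok := registry[kind]
	if !ok {
		return nil, fmt.Errorf("unknown discriminator %q", kind)
	}
	v := factory()
	v.Load(r)
	return v, nil
}

func main() {
	rows := []Row{
		{"_type": "Cat", "Name": "Tom"},
		{"_type": "Dog", "Name": "Rex", "Tricks": int64(3)},
	}
	for _, r := range rows {
		v, err := load(r)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%T %+v\n", v, v)
	}
}
```

Both rows come from the same kind, yet each is returned as its own struct type, which is what a query over a polymorphic kind would need.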

UPDATES

The App Engine team informed me (via the groups discussion at https://groups.google.com/d/msg/google-appengine-go/b8gkgnN0L1Q/q95QC4mD6zcJ) that they already had two of them on the short-term roadmap:

  1. Alternate Datastore Column Names for fields
  2. Functions for decisions: store/index this property?

These are two of the more important ones. The others can be worked around using application-defined convention and wrapper methods that expect the convention. I've built support for all the other 4 features for my application in about 400 lines of GO code, which doesn't duplicate but builds upon and depends on support provided by the SDK (including the 2 features on the roadmap). This shows that the team made the right decision in picking these 2 to support from the jump.

"If you're having code problems, I feel bad for you son. I got 99 problems but JAVA ain't one ... GO!!!"

Thursday, September 8, 2011

Objective Gripes with New Google App Engine Pricing


This attempts to objectively address areas where the new App Engine pricing may not be fair, and what Google may do to alleviate these concerns.

Disclaimer
I am an unabashed fan of Google App Engine, and have been for over three years. I don't think there's anything else on the market that comes close. I think it is a fine platform for any applications, from simpler small ones to large complex ones.

My only gripes are about the new pricing costs, how it is being rolled out, and some lack of transparency. These gripes are founded because they could cause the platform to lose its momentum, leading to an untimely death which is what they seem to be trying to avoid in the first place.

Hypothesis on making App Engine a "sustainable" business
Google said that the pricing model change was necessary to make App Engine a sustainable business. Many people took that to mean that App Engine may have been running at a loss. I interpret that differently.

My guess is not that they were making a loss, but that they were not generating enough revenue/profit. For a company Google's size, and with its new focus on high-value products, I believe each product had to have a material impact on the bottom line to survive. With the current user base and trajectory, the profits were not enough to make a material impact on the bottom line. Consequently, Google decided to hike up the prices while giving us short notice before implementing them, even though their own ducks were not in a row (buggy, untested scheduler, no concurrency in Python, etc).

Pricing Gripes
Google App Engine provides a Platform-As-A-Service system, with the following:
  • Data Hosting:
    We leverage IP Google has built for high performance, highly available and scalable storage. For these, we are charged separate premium costs for datastore storage and per operation.
  • Application Hosting:
    We leverage Google's hosting and routing infrastructure at the frontend and optionally backend. This comes with some value-added services, mainly:
    1.   Disaster recovery (multiple data centers)
    2.   Smart Routing (horizontal scaling, activity-based routing)
    3.   Distributed Caching (with non-deterministic eviction policy)
    4.   Scheduled Tasks 
    5.   API's to use their services 
  • Other Services:
    A few other premium-priced services which are optional to use (email, XMPP, channels, etc)

The key thing to note is that these are all charged separately. Discussing the cost of one of these should not be mixed up with value provided by the other services.

There are two big pricing changes which will affect users of the service in a way that is not deterministic:
  1. Cost of hosting application instances (front end and backend).
  2. Cost of datastore operations

Cost of hosting application instances (front end and backend)
Beyond the free quota, application hosting is priced at about $60 per month ($0.08 X 24 X 30) if you "average" 1 instance of 600MHz CPU, 128MB RAM, and increases by $60 per month for each additional instance.

Contrast this with other providers e.g. http://order.1and1.com/xml/order/ServerPremium where you get a 4.4 GHz CPU with 2G RAM for the same price. That's about 7X the CPU and 16X the RAM.

Given that the other services provided by App Engine are charged separately at a premium and there are tens of thousands of billing-enabled applications, this margin is a pretty hefty multiplier for each application (even taking the value-added features into consideration).

Beyond the fairness of the hosting cost, there is also the concern that some things are not deterministic and could cause our charges to balloon. These include:
  • Scheduler algorithms (which may cause unnecessary instances to be created)

Cost of datastore operations
We can't really comment on the fairness of the datastore prices until we've had a chance to see it for a while. These prices may be fair regardless, and developers will just have to go back to the drawing board to update their applications and re-optimize.

However, there are also things which are not deterministic and could cause our charges to balloon. These include:
  • Memcache eviction policy (which could result in further datastore access)

What Google may do to help alleviate concerns:
Google may do the following to alleviate our immediate concerns:
  1. Make the hosting prices fairer (ie reduce the hosting prices to a more palatable multiplier) for both frontend and backend instances.
  2. Implement some fair and transparent scheduler algorithm:
    This allows users direct the scheduler to use as few instances as technically possible and only sparingly start up extra instances. The following should be inputs to the scheduler mechanism:
    1. Request Latency
    2. Server Resource Consumption (CPU, RAM).
      This is more critical on a runtime like GO which (in theory) is not as limited by the number of available threads, and so can handle a lot more concurrent requests.
    3. Min Idle Instances:
      Allow and respect a min-idle-instances to be settable to 0. Also prefer that setting over the max-idle-instances (which could be 2).
    4. Maximum Number of concurrent requests per instance
  3. Support Throttling of requests.
    If I reach my threshold of max-budget and/or max-active-instances, start throttling requests instead of the options of shutting down for the rest of the day or having an extremely high bill. This will prevent me from getting slashdotted to broke, or having my site completely down.
  4. Implement a fairer and transparent memcache eviction policy:
    For example, evict from Memcache for larger users first. If there is still memory pressure, then for all applications using over a threshold, evict a certain percentage of their memcache, and continue doing that until memory usage is below the machine-level threshold.
  5. Make Python 2.7 Generally Available at least one month in advance of leaving Preview:
    This allows Python users have concurrency support. The current temporary half-price suggests that an instance can only handle 2 concurrent requests which is not enough.
In addition, these may help moving forward:
  1. Give us a better roadmap.
    With premium pricing should come more premium treatment. Many of the features on the roadmap have been there for over a year. Users have no insight into when these features may come. A roadmap should allow users to plan accordingly. For example, I might wait on building a feature if I know that Google plans to release it within the next 3-6 months.
  2. Let us know what experimental means.
    Currently, GO runtime is defined as experimental. What does this mean? Does this mean that the plug could be pulled on the GO runtime at any time, or will we be given a heads up and how long (e.g. 1 year, 3 years, etc)?
  3. Create a better solution for downloading your application's data.
    For example, allow users to request their data and charge them a fee comprising a flat admin fee and a variable cost based on the size of the data. Then ship the encrypted data to the user on a set of portable media (DVDs, portable hard drive, etc).

Responses to critical responses to our outcry
The user outcry has drawn some critical responses from people more sympathetic to Google with regard to the new App Engine pricing. Let's respond to them.
  1. Google is not a charity:
    No one expects Google to give this out free. Google is also free to decide to start charging us more for the significant IP they've built over the years even before App Engine was conceived. Our gripe is that the charge of hosting in isolation is very high for the server resources we are being given.
  2. App Engine was in preview before.
    Forgive me, but after 3 years with no idea that the price may change, there's no way we could have seen this coming. Google even introduced pricing of some things that were in-line. For example, Always On was released in December, and gave us 3 Always-On instances for $9 less than a year ago and over 2 years into the preview. Contrast that now with 1 instance for $56. It's not an apples-to-apples comparison, but my point is still made.

    An argument can even be made that App Engine exited Preview in February 2009 when billing was enabled. Look at the first blog post from http://googleappengine.blogspot.com/2008/04/introducing-google-app-engine-our-new.html, and I quote: 
    "During this preview period, applications are limited to 500MB of storage, 200M megacycles of CPU per day, and 10GB bandwidth per day. We expect most applications will be able to serve around 5 million pageviews per month. In the future, these limited quotas will remain free, and developers will be able to purchase additional resources as needed."
    In summary, Google had over 8 months to monitor usage of early adopters, with over 100,000 applications, before releasing a billing plan which at the time was considered fair. It also introduced new pricing within the last year which was in line with its typical low pricing. We were not even aware that App Engine was still in preview.


References:
 http://googleappengine.blogspot.com/2010/12/happy-holidays-from-app-engine-team-140.html
 http://code.google.com/appengine/docs/roadmap.html
 http://googleappengine.blogspot.com/2008/04/introducing-google-app-engine-our-new.html
 http://googleappengine.blogspot.com/2009/02/new-grow-your-app-beyond-free-quotas.html
 http://code.google.com/appengine/kb/postpreviewpricing.html
 http://www.google.com/enterprise/cloud/appengine/pricing.html
 http://code.google.com/appengine/docs/quotas.html#Requests
 http://code.google.com/appengine/docs/adminconsole/performancesettings.html
 http://order.1and1.com/xml/order/ServerPremium

Thursday, September 1, 2011

Google App Engine New Pricing Sucks


Google has done a major disservice to its cult of developers by changing the pricing terms of App Engine ridiculously while giving developers short notice to react. In doing so, Google may have done severe damage to their brand and the trust that developers put in them.

Google released app engine in 2008 on a set of premises:
  • You should design and code your application for performance, because that is what you will be charged for.
  • The cost of spinning up your instances is negligible, and they will be spun up when needed and spun back down. 
  • Utilize Google's proprietary APIs, and don't worry about the lockin, because Google takes your trust seriously and we will not screw you over.
  • Your applications will scale while costing you little. 
  • If the service does not take off, we will give you three years notice to move your application away. Beyond that, you have nothing to worry about.

Sometime around May 2011, Google decided to change the rules of the game on everyone:
  • The pricing model would change. Instead of charging for performance (CPU), we would be charged for instances started. We were asked to change how we wrote our applications.
  • We were assured that the new pricing would probably raise our bills by up to 4X, but not anything like the 10X and above that we were getting vexed about
  • We were assured that a comparison billing will be released before the new pricing takes effect giving us enough time to make changes to our application or move off if the new pricing is not cost-beneficial to us
  • There was a tacit suggestion that Google would not betray our trust too badly, even as we were all very upset at the state of things
  • There was an acknowledgement that the new pricing model was not beneficial to Python applications (who were all of the early adopters), and that a new Python runtime would be released to mitigate things

Today, September 1, 2011, Google did the following:
  • They released the comparison billing, which showed many people's costs going through the roof (5-10X for large apps, 300X for small apps ie some bills went from $5000 to $26000, and some went from $60 to $2000, and some from free to $60)
  • They said the new billing would go into effect in 2 weeks (really, only 2 weeks heads-up?)
  • They gave us a 50% discount for 7 weeks (really, a 50% discount when prices are going up by up to 30,000%?)

It's inconceivable and downright insulting that they released a new pricing comparison tool that they have been promising only 2 weeks before it takes effect. This has to be one of the biggest "F$CK YOU" that I know about in a development community. The only option for most customers is one of the following:
  1. Pay Google their new rates with a significant loss to you (ie suck it up)
  2. Screw your own downstream customers by shutting down your app without notice or with very short notice
  3. If shutting down, you will still incur a ridiculous bill to download your data, because Google does not have a simple way to request your data i.e. you have to write an app that downloads your data over the internet and pay per the number of entities (regardless of the size of those entities). (Amazon allows you send them a drive, and they download your data and ship the drive back to you for a fair and known fee).

This just reinforced one or both of the things that the community has been upset about:
  1. The App Engine Management team is clueless and terrible. 
  2. Google does not care about customer relations or maintaining trust or brand equity. I doubt they would have done this for a business customer base.

Google is a data-driven company. I have to believe that they did some analysis of the cost structure before releasing the new pricing. However, the way the new pricing was announced and the lack of clarity do not suggest that. It felt like someone came up with this new model and attached some prices to it. They couldn't say clearly what they expected the impact of the new prices to be. However, now that they seem to have an idea, they are still bent on screwing us (the developer base).

There are a few things Google could have done to show they didn't just tell their developer base "to hell with them":
  • Release the comparison billing 3 months before the new billing takes effect.
  • Release the new billing only after the Python runtime with concurrency has been released

Every software ecosystem thrives on the backs of its faithful developer base. Once you start screwing them over, you stunt and start to reverse the growth of this very important base. Google seems bent on screwing these developers (something I doubt Microsoft would ever have done in their heyday).

Unfortunately, a lot of the backlash has been on the Google Groups. We hoped that we could talk to Google in-house and let them know of our disappointment, but that didn't make a difference.

Now, complaining on Google Groups is like crying foul to ourselves. That is like an oppressed people telling themselves that they are being oppressed. For Google to take this seriously, we will need to shout it outside and make this an issue beyond our inner circle. Let's all take to blogs, Facebook groups, Twitter, Google Plus, etc.

For my part, I have started the following:
  1. Blog Post:
    http://blog.ugorji.net/2011/09/google-app-engine-new-pricing-sucks.html
  2. Google Plus Post:
    https://plus.google.com/115545205253432230935/posts/7oJGwFEgDgC
  3. Facebook Page:
    http://www.facebook.com/pages/GoogleAppEngineNewPricingSucks/226690334046342
  4. Twitter Hashtag:
    #GoogleAppEngineNewPricingSucks 

Other posts I have seen about this are linked below:

Also, I don't know the legal ramifications, but maybe a class action suit might be in order. In my opinion, we were deceived by Google, plain and simple. To make matters worse, Google is giving us a set of unfair options, preventing us from leaving gracefully. Two weeks notice is inconceivable for development work that has been going on for years.

UPDATES:
Sept 8, 2011: Objective Gripes with New Google App Engine Pricing
This attempts to objectively address areas where the new App Engine pricing may not be fair, and what Google may do to alleviate these concerns.