Google I/O 2011: 3D Graphics on Android: Lessons learned from Google Body

NICO WEBER: Welcome to the last time slot at Google I/O.

I'm Nico.

I'm glad you guys could make it.

So did you have a good time so far?

Any favorite sessions so far?


NICO WEBER: Yeah I saw that one.

I really liked that.

So you might have noticed it's kind of hard to

get tickets for I/O.

One of the easiest ways for me to get in was to give a talk.

So that's why I'm here.

And it's very different to watch all the talks when you

know you'll be talking later.

So one thing I've been paying attention to a lot is what the

presenters do with their hands, knowing that I have to

do something with my hands.

So all the professional presenters did like this.

I guess they read some book on body language and read that

this means open and relaxed.

And all the more engineering types of guys were like this.

And then they did a stance in between.

And I guess my conclusion is, talks are more fun if you

don't pay attention to the hands of the

presenter so much.

So don't look at my hands.

I'm going to be talking a little bit about 3D graphics

on Android.

Earlier this year I ported Google Body to Android 3.0.

And I'll just share my experience there, I guess.

So who here has Google Body?

Quite a few.

So for those who haven't, if you just do a web search for

Google Body and pick the first hit on most search engines,

you'll go to this thing here.

And it's basically a human anatomy app.

Users call it Google Earth for the body.

So there's a 3D app of the human body.

You can zoom in, pan around.

There's a transparency slider on the left here, where you

can look at skeleton and whatnot.

There's a search box up here where you

can search for stuff.

And you might not have known that the liver has kind of an

interesting 3D structure from the back.

You can click on things to learn what they're called.

So this is the colon.

And that's basically Google Body.

For April 1 we had Google Cow, which was kind of popular.

So it will be a little while until it loads.

So that's the same thing for a cow.

It was pretty popular so we left it in the app.

So that's Google Body.

So Google Body is obviously a web app.

It lives in the browser.

And for 3D display, it uses a fairly new technology

called WebGL, which was also demoed in the keynote

this morning and there were a few talks on that.

So there's no plugins or anything needed for that.

You just need a new browser.

So for example, Chrome supports WebGL.

Firefox 4 does.

WebKit, which is the Safari prerelease version, I guess,

kind of, supports WebGL.

There's an Opera 11 preview that

supports WebGL on Windows.

But sadly the Android browser does not support WebGL yet.

Google Body is a 20% project by about

five people at Google.

So Google has this concept of 20% time.

One day of the week you can work on whatever you

want, if you want.

And they were looking for someone to make Google Body

happen on Android.

So I figured, yeah that sounds like fun.

I'll do that.

And let me show you how it looks.

So Google Body for tablets is available in the market today.

So if you're looking for something to do with your

tablets, you can download this.

And it's basically the same thing.

So there's a 3D view of a model that

you can move around.

You can zoom in, zoom out.

Look at different layers up here.

You have a search box where you can--

I don't know-- search for skull.

Oh, it's right there.

You can tap on things.

So these things here are called teeth.

And there's a fun bug where I don't do modular arithmetic on the rotation angle.

So if you spin the model a bit and then click on the reset

view button it spins a bit too often.

So that's basically Google Body for Android.

The cow is not in there yet.

But it'll come eventually.

So that's what I did.

And currently this is tablets only and currently I am

working on getting this to work on phones.

And I'd like just to share my experience writing this a

little bit.

So Google Body was released December 2010.

I did the port after that.

So they send out a mail saying, anyone interested in

porting this to Android?

And I was like, yeah, if nobody else stands up.

Sure I'll do it.

And then they told me, awesome.

And we want this for tablets and you have two weeks and go.

So my point is, I don't have a ton of Android experience.

So I'm not on the Android team.

What I'm saying is my personal opinion; not an official statement.

It might be factually wrong.

Parts are, probably.

And what I'm mostly focusing on is doing

3D graphics on Android.

I kind of assume that you are somewhat

familiar with Android.

So who here knows what an activity is?



Who here has used OpenGL before in any form?

Also most people.


Who here has done OpenGL on Android?

OK, so not as many.

That's perfect.

So I think this talk is perfect for you if you have

some experience with Android, some experience with OpenGL,

but not so much with the combination.

And if you are completely new to Android, I gave a version

of this talk at the Game Developers Conference

earlier this year.

And if you just do a web search for GDC 2011 Android

OpenGL, you'll find this page, which has a slightly more

basic version of this talk with uglier slides.

So Google Body for Android is a native Java app and it uses

OpenGL ES 2.0 for the 3D display.

So let's see what I'll be talking about.

So I'll very quickly tell you what OpenGL ES 2.0 is.

It's actually faster than saying the whole

word, OpenGL ES 2.0.

And then I'll give you a very, very rough

mental model of GPUs.

Tell you a few pitfalls with textures.

A few best practices and pitfalls with geometry, that

is Vertex Buffer Objects.

Then I'll tell you quickly how to quickly get data into

Vertex Buffer--

into ByteBuffers, which you need to

upload them to the GPU.

And then I'll say a few words about performance tweaks.

So OpenGL ES 2.0.

So I guess everyone here knows OpenGL.

It looks like this, right?

Looks familiar to anyone?


So OpenGL is basically the 3D API.

There are implementations on Windows,

MacOS, Linux, many phones.

It's been around forever, so it's very versatile.

As I said, it's been around for a long

time, 20 years I think.

And it has accumulated some crap during that time.

And they are currently cleaning that up, but by the

time they wanted to do 3D on phones, OpenGL was kind of

messy, so they decided to release mostly a subset.

OpenGL for Embedded Systems. That's what the ES stands for.

And OpenGL ES is basically OpenGL with fewer functions.

So they got rid of glBegin and many other things.

And there are two versions of OpenGL ES: OpenGL ES 1

corresponds to OpenGL 1, more or less.

And it has a fixed-function pipeline.

So that means every model you draw will do vertex transform

and rasterization and some predefined lighting

functions and so on.

And there's OpenGL ES 2, which roughly corresponds to OpenGL

2, which has fully programmable vertex and

fragment shaders and all that.

And Android supports both OpenGL ES1 and OpenGL ES 2.

If you do a web search for Android OpenGL you'll find

some official Android documentation that proudly tells you

Android supports OpenGL ES 1.

And that's factually not wrong, I guess, but it also

supports OpenGL ES 2 and that's what you

want to use in practice.

And I think they are updating their documentation there but

they are not there yet.

And just as an aside, WebGL is basically a binding

for OpenGL ES 2.0.

So in theory, mobile browsers could support WebGL in the

future but they don't yet.

And WebGL is very exciting but nothing that I'll talk about

in this talk.

Because Body for Android doesn't use it.

So as I said, I'm currently porting

Body to mobile phones.

So I kind of need to decide which Android versions I want

to support.

If you're just writing a tablet app, you just support

Android 3.0. That's easy.

But for phones you need to take a look at this chart,

which is at

And Android 1.5 and 1.6 are less than--

I think about 5% of the market share these days.

So I don't think it's really worth supporting.

Android 2.1 is, I think, about 24%.

Which is pretty sizable.

Android 2.2 is at 65-ish% and Android 2.3 is 4%.

And that adds up to about 100, I hope.

So Android 2.1 is the first version of Android that

supports OpenGL ES2.

But only in the native code.

So there are no Java bindings or anything like that.

So if you want to do OpenGL ES 2.0 and support Android 2.1

you need to add your own driver bindings, which is not

hard but annoying.

And I personally haven't used Android 2.1 at all yet.

So I won't say a lot about Android 2.1 or anything.

Android 2.2 is the first version that adds Java

bindings for OpenGL ES2.

So I think that's a reasonable lower bound, at least for the

first iteration of your project to target.

It also added support for compressed textures, or added

API support for compressed textures.

And many other cool things.

And finally, Android 2.3.

From a Java OpenGL perspective,

added only bug fixes.

If you're writing native code 2.3 added a lot of cool stuff.

But for just graphics applications like Google Body,

I think Java is fast enough.

You're just pushing data through the

graphics card anyway.

And Java is kind of like, the better paved way to write

Android applications.

So Google Body is written in Java.

And my plan is to port it to 2.3 first and then if stuff

works there reasonably well then get it working on 2.2 and

maybe eventually on to 2.1.

I think no new project should use OpenGL ES1.

I think 90% of all phones support OpenGL ES2.0.

All new phones support it.

And if you feel that you really want to support the

last 10% that don't support the OpenGL ES2 these phones

are also pretty slow.

Weak CPUs, weak RAM.

So you probably are writing a second, lo-res version of the

app anyway.

So I think every new app should go OpenGL ES2.

So let's take a little look at how you actually do this.

So the class that does OpenGL rendering in Android is GLSurfaceView.

And actually it's pretty easy to use.

In your activity, in your onCreate method, you just

create a GLSurfaceView.

And then you say setEGLContextClientVersion to

inform the view that you want to use ES 2.0, which has the

programmable shaders and all that.

And then you set a renderer object, which is your own

class that implements

the GLSurfaceView.Renderer interface.

We'll get to that in a second.

And then you also forward onPause and onResume to the

view so that when your application goes in the

background it stops rendering and that stuff.

So that's all you have to do in your activity.
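
Put together, the activity side is roughly this (an untested sketch against the Android APIs; BodyActivity and MyRenderer are hypothetical names):

```java
import android.app.Activity;
import android.opengl.GLSurfaceView;
import android.os.Bundle;

public class BodyActivity extends Activity {
    private GLSurfaceView view;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        view = new GLSurfaceView(this);
        view.setEGLContextClientVersion(2);  // ask for an ES 2.0 context
        view.setRenderer(new MyRenderer());  // your GLSurfaceView.Renderer
        setContentView(view);
    }

    // Forward lifecycle events so rendering pauses in the background.
    @Override
    protected void onPause() {
        super.onPause();
        view.onPause();
    }

    @Override
    protected void onResume() {
        super.onResume();
        view.onResume();
    }
}
```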

Then in your manifest you just add uses-feature with GL ES version

2.0 and required set to true.

And that way the market knows that the application requires

OpenGL ES2 and it will only show it to phones

that support that.
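
In the manifest that looks roughly like this (a sketch; the attribute names follow the Android manifest schema):

```xml
<!-- Tell the market this app needs an ES 2.0-capable GPU. -->
<uses-feature
    android:glEsVersion="0x00020000"
    android:required="true" />
```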

And finally you need to write your own little renderer.

So if you're using OpenGL ES2 you call static

functions on GLES20.

So I recommend doing an import static for

everything in there.

And then you can just do normal OpenGL calls like you

used to do that in other languages.

So you don't have to write GLES20.glClear or whatever.

You can just write glClear.

And this interface has three methods.

One is onSurfaceCreated, which is called when your

context is first created and then a couple more times.

We'll get to that in a second.

And there's onDrawFrame, which is called every time you

should render.

By default this is called 60 times per second.

But you can also tell the system to only draw

your view on demand.

And there's onSurfaceChanged, which is not very interesting

in practice.

So I'd like to do a tiny demo of how this looks in practice.

Coworkers informed me that it's too risky to switch back to

Eclipse for a demo.

So I'll just do this right on my slide.

So onSurfaceCreated will do glClearColor with--

some reddish shade of gray.

And I'll also tell the view to not draw at 60 frames per

second but only when needed.

And in here we'll just clear the background.
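
A renderer along the lines of this demo could look like the following (untested sketch; note that render-on-demand is set on the view with setRenderMode(GLSurfaceView.RENDERMODE_WHEN_DIRTY), not on the renderer):

```java
import javax.microedition.khronos.egl.EGLConfig;
import javax.microedition.khronos.opengles.GL10;
import android.opengl.GLSurfaceView;
import static android.opengl.GLES20.*;

public class MyRenderer implements GLSurfaceView.Renderer {
    @Override
    public void onSurfaceCreated(GL10 unused, EGLConfig config) {
        // Called when the context is created (and possibly again later).
        glClearColor(0.5f, 0.4f, 0.4f, 1.0f);  // some reddish shade of gray
    }

    @Override
    public void onDrawFrame(GL10 unused) {
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    }

    @Override
    public void onSurfaceChanged(GL10 unused, int width, int height) {
        glViewport(0, 0, width, height);
    }
}
```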

If I click this run button, hopefully the code will be

copied into some Java file in the background and then

uploaded to the tablet.

So it still says compiling.

So it says uploading.

Let's switch to the other box.

Now I just opened the IO OpenGL app.

I switched slightly too slowly to see it starting.

And that's a hardware-accelerated flashlight app.

So, mission accomplished.

It draws just one frame because the render mode is when-dirty.

So that's the OpenGL Hello world, I guess.

And that's about 20 lines.

Not too bad.

So one cool thing that GLSurfaceView gives you is

that it creates a dedicated renderer thread for you.

All the GL stuff will execute on the renderer thread.

Which means that if your UI thread is overloaded, you

still have smooth rendering.

And if your rendering is kind of slow, your app is still

responsive to tap events and all that.

So one thing that you need to do every now and then is to

relay an event from the UI thread to the OpenGL thread.

Because UI land and GL land are kind of single threaded,

so every OpenGL call has to be done on the GL thread.

Every UI call has to be done on the UI thread.

So for example, onClick-- I guess that should be onTouch

or something like that.

When onTouch is called on your UI thread, you might want

to tell the renderer to draw--

I don't know-- a particle system at the touch location

or something.

So you need to somehow relay the event from the UI thread

to the renderer.

So the way you do this, you just call

.queueEvent on the GL view.

And pass a Runnable.

And then this will be executed on the GL thread.

So if you want to, for example, access item in here,

then Java has this limitation that the int has to be final.

So you just put a final in there.

And then you can just use item in here on the other thread.

One little pitfall there is, if you want to use an object that's

passed in, for example a touch event, and you just do

final TouchEvent event.

And then use event down here.

And then by the time the GL thread executes the runnable,

the UI thread has already reused the

touch event up here.

Changed it internally and passed it on

to a different view.

Because the UI thread reuses objects so it doesn't allocate

memory all that much.

So by the time your renderer looks at the touch event

object, the data is all wrong for you to use.

So you should make a copy of all parameters and then have a

final local variable and use that in the runnable.
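
The copy-the-fields pattern can be shown in plain Java, without any Android classes (TouchEvent and run are hypothetical stand-ins; the Runnable plays the role of the one you'd hand to queueEvent):

```java
public class EventCopyDemo {
    // Hypothetical stand-in for a framework event object that gets reused.
    static class TouchEvent {
        float x, y;
    }

    // Returns {copiedX, copiedY, staleX, staleY}.
    public static float[] run() {
        TouchEvent event = new TouchEvent();
        event.x = 10;
        event.y = 20;

        final float x = event.x;            // safe: primitives copied now
        final float y = event.y;
        final TouchEvent captured = event;  // unsafe: object will be reused

        final float[] out = new float[4];
        Runnable glWork = () -> {           // what you'd pass to queueEvent
            out[0] = x;
            out[1] = y;
            out[2] = captured.x;
            out[3] = captured.y;
        };

        // The UI thread recycles the event before the GL thread runs:
        event.x = -1;
        event.y = -1;
        glWork.run();                       // later, on the GL thread
        return out;
    }
}
```

The copied primitives survive; the captured object shows the recycled values.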

The other direction from the renderer thread to the UI

thread isn't needed all that often.

Just for completeness you can do activity dot run on UI

thread and pass in the runnable and then this is

executed in the UI thread.

In Body I used this, for example, when you touch

muscles, and I need to see which muscle was tapped.

And so I basically render the polygons in some made up

colors and then I read the screen and see what color was

below the finger and then have a mapping from colors to objects

and then tell the UI thread, this thing was touched.

Use GLSurfaceView.

My advice.

That earns you a happy Android, which is, I guess, kind

of like a gold star.

It's very easy to use.

It gives you a dedicated renderer thread for free and

it's very well tested.

So some people on the internet recommend that you run your

own little surface holder thing.

For example, Chris Pruett, who talked here earlier today, has

an open source game called Replica Island and he has his

own GLSurfaceView fork.

And he has one screen full of comments about something that

went wrong.

Like, a few graphics drivers misbehaved under very specific

circumstances and it took him two weeks to track that down.

So don't be Chris Pruett.

Use GLSurfaceView.

A little word of warning though.

GLSurfaceView loses its OpenGL context very often.

So every time you call onPause it'll forget all OpenGL state,

like uploaded pictures and so on.

It'll call onSurfaceCreated on your renderer object and then you

need to re-upload all your pictures and so on and that

can be slow.

So make that fast. If you're targeting 3.0 or later you can

call setPreserveEGLContextOnPause.

But if your device supports only one OpenGL context and

the user switches to another app that uses OpenGL and he

switches back to your app then your stuff is gone anyway.

So make loading your data fast is the lesson here, I guess.

Alright so that's basically the OpenGL Hello world.

Here's a very high level picture about how GPUs work.

So up there there's the CPU, which executes your Java code.

And then there's this OpenGL API, where all the data that

needs to be rendered needs to be pushed through.

And then the data ends up in graphics memory here.

And then the GPU reads vertex data from there, runs

vertex shaders, rasterizes all the varyings, sends them

through the fragment processor, which runs all your

fragment shaders.

And that's written to the frame buffer.

So as I said, this is very simplistic.

There's no blending stage in here.

On some GPUs, vertex processors and fragment

processors are executed on the same

silicon and thus are shared.

But basically my point is, you don't want to send a lot of data

over this bus because that's very slow.

And also many GPUs cache vertex data pre-transform,

post-transform, they cache textures.

So to make these cache efficiently, you also want to

keep your data very small.

And that's basically how GPUs work.

Now you know.

My point basically is, don't send lots of data to the GPU

on every frame.

And if you do, then don't do it in many small calls.

Just do big, bursty calls.

So here's a piece of OpenGL 101 that I

think everyone knows.

If you do glTexImage2D with the texture data at the end to

basically set the current texture and then you draw your

model, then this will upload the current

texture every frame.

And that's expensive, so don't do that.

Instead, in your own surface creator method, you create an

identifier for the texture, which is just an int.

You tell OpenGL make this, make texture--

I don't know-- number five current.

Then you upload the data once into texture number five.

And then in your onDraw method you just bind

texture number five once.

And then you draw your model.

So everybody knows that.

I'm saying this because nearly the same is true for vertex

buffer objects later and it's not as well known there.
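
In code, the create-once, bind-per-frame pattern looks roughly like this (an untested sketch against the GLES20 static API; TextureHolder is a hypothetical name):

```java
import static android.opengl.GLES20.*;
import java.nio.ByteBuffer;

public class TextureHolder {
    private int textureId;

    public void onSurfaceCreated(int width, int height, ByteBuffer pixels) {
        int[] ids = new int[1];
        glGenTextures(1, ids, 0);            // ask GL for an identifier
        textureId = ids[0];
        glBindTexture(GL_TEXTURE_2D, textureId);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        // Upload the pixel data once, not every frame:
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, width, height, 0,
                     GL_RGB, GL_UNSIGNED_BYTE, pixels);
    }

    public void onDrawFrame() {
        glBindTexture(GL_TEXTURE_2D, textureId);  // cheap: just a bind
        // ... draw the model ...
    }
}
```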

You should also use texture compression.

So ETC, which is short for Ericsson Texture Compression.

So what the heading says is use Ericsson Texture

Compression for RGB texture compression.

It's an extension to OpenGL ES2 that's supported on virtually

all devices out there.

Or on all devices that I know of.

And if you use ETC, every pixel needs only four bits.

So that's compared to 16 bits per pixel.

That's a 75% memory win.

And sadly, this isn't documented very well.

So I didn't know about this.

So I launched Google Body without doing this.

Then I read about this, enabled texture compression

and that saved, like, 10 megabytes of RAM, which is

quite a bit.

So there's this etc1tool binary in the Android SDK tools

folder that I didn't know about.

So when I used this the first time I did a web search for

ETC1 compression and I found some binary on some Ericsson

website that ran only on Windows, and the

source code didn't build on MacOS, so I patched

that and used it.

Turns out there's a binary in the Android SDK.

It's just nobody tells you.

Nobody told me, at least. And if you then add

supports-gl-texture to your manifest, then the market

knows about this.

And the Android is happy again.
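
The manifest line is roughly this (a sketch; the texture name is the standard one for ETC1):

```xml
<!-- Tell the market this app ships ETC1-compressed textures. -->
<supports-gl-texture android:name="GL_OES_compressed_ETC1_RGB8_texture" />
```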


And it's very easy to load textures.

So on your I/O thread you just do ETC1Util.createTexture and

pass in an input stream.

This loads the texture into

memory, and then on your GL thread you can upload it.

Obviously you never want to do I/O on the UI thread or the GL

thread because I/O can be unpredictable and might just

take 100 milliseconds and you don't want your UI or your

rendering to stutter, so you should always have a

dedicated I/O thread.
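
As a sketch (untested; ETC1Util lives in android.opengl, and the fallback format arguments only matter on devices without ETC1 support):

```java
import android.opengl.ETC1Util;
import java.io.IOException;
import java.io.InputStream;
import static android.opengl.GLES20.*;

public class EtcLoader {
    private ETC1Util.ETC1Texture texture;

    // I/O thread: parse the compressed stream into memory.
    public void load(InputStream in) throws IOException {
        texture = ETC1Util.createTexture(in);
    }

    // GL thread: upload to the currently bound texture object.
    public void upload() {
        ETC1Util.loadTexture(GL_TEXTURE_2D, 0, 0,
                             GL_RGB, GL_UNSIGNED_SHORT_5_6_5, texture);
    }
}
```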

One small word of warning.

If the width or the height is not a multiple of four, then

the PowerVR GPUs just display noise for your texture.

PowerVR is used on

the Nexus S, for example.

In practice that's not a huge problem because most textures

are power of two sized anyway.

And most powers of two are also multiples of four.

And for heads-up displays, you can make your texture sizes a

multiple of four.

Something to keep in mind.

So now we know how to upload textures.

Now the same for geometry.

So same thing as for textures, if you do

glVertexAttribPointer and pass the attrib data in the last

parameter here, then this uploads all the

vertex data to the GPU.

And if you do this on every frame, then you are copying

lots of data around.

So don't do that.

And this is, for some reason, less well known.

The OpenGL ES 2.0 example in the Android SDK does this.

So don't look at that example.

I guess the excuse is OpenGL ES1 only supported this way

and they haven't updated this since then, I guess.

So instead, just like with textures, you create a numeric

ID then you bind this.

So GL_ARRAY_BUFFER is used for attribute data like positions.

And then you do glBufferData with your data.

And the same for the indices.

Then at run time you just bind the array buffer and the

element array buffer.

And you pass zero for the last parameter instead of data.

And that's way faster.

So two things to keep in mind here.

One is you need to have this attrib data and this index

data somehow.

And these need to be direct ByteBuffers, which I'll talk

about in a second.

And then, as I said, you call glVertexAttribPointer with

a zero back here and glDrawElements

with a zero back here.
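
Put together, the upload-once, bind-per-frame version looks roughly like this (an untested sketch; a single position attribute and short indices are assumed, and Mesh is a hypothetical name):

```java
import static android.opengl.GLES20.*;
import java.nio.ByteBuffer;

public class Mesh {
    private int vbo, ibo, indexCount;

    // Once, e.g. in onSurfaceCreated: copy the data into GPU buffers.
    public void upload(ByteBuffer attribData, ByteBuffer indexData, int indexCount) {
        int[] ids = new int[2];
        glGenBuffers(2, ids, 0);
        vbo = ids[0];
        ibo = ids[1];
        this.indexCount = indexCount;

        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBufferData(GL_ARRAY_BUFFER, attribData.capacity(), attribData, GL_STATIC_DRAW);

        glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
        glBufferData(GL_ELEMENT_ARRAY_BUFFER, indexData.capacity(), indexData, GL_STATIC_DRAW);
    }

    // Every frame: only bind IDs and pass 0 as the offset, no re-upload.
    public void draw(int positionAttrib) {
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
        glVertexAttribPointer(positionAttrib, 3, GL_FLOAT, false, 0, 0);
        glEnableVertexAttribArray(positionAttrib);
        glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, 0);
    }
}
```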

And if you run this on FroYo, your application will crash.

And the reason for that is that they forgot to add the

bindings for these two method calls.

So it compiles just fine.

But at run time, when Android tries to call the C method

that backs this OpenGL draw, it doesn't find anything.

Which is a bit annoying but it's pretty easy to fix.

You basically need to add your own bindings

for these two functions.

So if you're familiar with the NDK, that's pretty easy and if

not then I guess it's kind of magic.

You just copy paste and you're done.

Who here has used the NDK?

Not many people.

OK, basically what you do is you create a normal Java class

and then you put your method there.

But instead of putting implementation there you put

native in front.

And this tells Java that this method exists.

It takes these parameters and that it should look somewhere

else for the implementation.

It's not implemented in Java.

And the same for the other function that's missing.

And then you do a System.loadLibrary down here in the

static initializer.
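
The Java side of such a fix might look like this (a hypothetical GLES20Fix class; the native bodies live in a small NDK library, not shown here):

```java
// Declares the two bindings missing from the FroYo Java API.
// The implementations are provided in C via the NDK; loadLibrary pulls
// in libgles20fix.so, built with ndk-build.
public class GLES20Fix {
    static {
        System.loadLibrary("gles20fix");
    }

    public static native void glDrawElements(
            int mode, int count, int type, int offset);

    public static native void glVertexAttribPointer(
            int index, int size, int type,
            boolean normalized, int stride, int offset);
}
```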

And then this is something you write with the NDK.

So you create a jni subfolder in your project and you paste

in this bit of code.

So there's a function with this weird naming convention--

Java_com_example_io_GLES20Fix_glDrawElements.

And Java uses this function name to

associate it with a class.

So it starts with Java then it has the package name,

then it has the class name, and then it has the method name.

In here you put the implementation of your method.

So the first two parameters of a JNI method are always

JNIEnv* env and jclass c.

And then the rest are the parameters from the functions.

So this is int, int, int, int just like here,

int, int, int, int.

Only with a j in front.

So JNI is Java Native Interface, which is basically

the technology you use to call C from Java.

And then we just call the C function

for our GL Draw elements.

And exactly the same for vertex attrib pointer.

And then you copy this file, put that also

in your JNI folder.

Go into that folder and then you call NDK build from the

NDK and this will create some library.

And then you do a clean build in Eclipse, which will pick up

that library and copy it into your APK and then you can call

GLES20Fix.glDrawElements and then that works on FroYo.

So that's that.

Let me say a few words about filling ByteBuffers.

So ByteBuffers are the things that you pass to

glBufferData.

And it's basically a block of raw C memory.

So if you're not familiar with--

Who here knows C?

Are the same people who used the NDK, roughly.

No surprise.

So Java obviously has managed memory.

C doesn't.

And in some JVMs, these memories live

in different areas.

And OpenGL needs to have the raw C memory, for some reason.

You just need to know you need to use direct

ByteBuffers for that.

And it turns out, doing element-wise access on these

is pretty slow.

So you get a ByteBuffer by doing

ByteBuffer.allocateDirect and then some size.

And then if you want to load data from a resource into a

direct ByteBuffer, don't just get the input stream and

element-wise put stuff.

Basically don't read one byte from the input stream and put

it into the direct ByteBuffer.

This is very slow for some reason.

Behind the scenes this does several method calls.

One JNI hop and so on.

It's much better to do this in blocks.

So in Body I think I used four kilobyte blocks.

And this sped up loading by, I think, 8 seconds.

So it's still a bit slow, but it's done in parallel so

that's fine.
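
A block-wise loader can be sketched in plain Java (the 4 KB chunk size matches what I used in Body; BufferLoader is a hypothetical helper name):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class BufferLoader {
    // Reads a whole stream into a direct ByteBuffer in 4 KB chunks.
    // Each bulk put() is one JNI crossing, versus one crossing per byte
    // with single-byte puts.
    public static ByteBuffer load(InputStream in, int size) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(size);
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.put(chunk, 0, n);
        }
        buf.rewind();
        return buf;
    }
}
```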

And you can do even better than that if you are willing

to make some compromises.

So as you might know, APK files are just Zip files.

And if you give your resources some magic extensions, your

resources won't be compressed in the zip file.

They will just be an uncompressed part of the zip

file somewhere.

So for example, PNGs and JPGs are compressed already, so

they aren't recompressed again.

And also the extension, JET, is one of these magic extensions.

I have no idea what file format this actually is.

But if I want to have a resource that's not compressed

I call it dot JET and put it in my resource folder and then

it's not compressed.

And the cool thing about uncompressed resources is that

they are basically just a chunk of your APK file.

And you can get a file handle through that.

You can then call openFd, which gives you an asset

file descriptor, from which you can get a file input stream

instead of just an input stream.

And from a file input stream you can then get a channel and

the channel you can mmap.

And mmap returns a MappedByteBuffer and

MappedByteBuffers are always direct.

So in this case no conversions at all have to be done.

You can just use this and pass this through GL buffer data.

And this is another 10x or so faster than

the previous thing.

So if you're willing to not to compress your resources you

can have really, really fast loading this way.
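
Here's the idea in plain Java (a sketch; a temp file stands in for the APK, and on Android the offset and length would come from the AssetFileDescriptor returned by openFd):

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapDemo {
    // Maps [offset, offset+length) of a file straight into memory.
    // MappedByteBuffers are always direct, so the result can go to
    // glBufferData with no copying at all.
    public static MappedByteBuffer map(File f, long offset, long length)
            throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r");
             FileChannel ch = raf.getChannel()) {
            return ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
        }
    }
}
```

The mapping stays valid after the channel is closed, so the buffer can be handed around freely.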

So a small word of warning.

ByteBuffer dot allocateDirect allocates more memory than you

tell it to.

So if you just do a tiny test program that does ByteBuffer

dot allocateDirect with 15 megabytes and then look at

logcat, then logcat will tell you it

allocated 60 megabytes.

So it overallocates by a factor of four.

Which is an Android bug that's being fixed, I think.

But not yet.

So keep your buffers small, I guess.

In Google Body if you look at the market page, there are two

one-star comments that tell you this app is crap.

It crashes all the time.

And that was because of this bug.

Basically when Body was loading and people pressed on

the screen, it crashed a lot of the time.

So as I said, to do touch detection I basically rendered

the whole scene into a back buffer.

And I created an off-screen buffer for the whole screen,

which is about a million pixels.

And then two bytes per pixel for color, two bytes per pixel

for depth for that buffer.

So that's about four megabytes.

With overallocation by 4x that's 16 megabytes.

And if loading's going on in parallel,

that's too much memory.

So Body crashed with out of memory.

And I fixed that by not rendering the whole screen

into back buffer but only the 20x20 pixels

around the touch event.

So just something to keep in mind.

And one thing that I also learned is that, if you don't

have many users and you get two one-star ratings, that

really hurts.

I used to have a 4.5 average and then it went down.

Tough times.

Another pitfall: compressed files can be,

at most, one megabyte

uncompressed, on Android 2.2.

And the reason that is, I guess, is because the Android

guys have a static buffer that's one megabyte that they

use to uncompress in.

And if the uncompressed size is larger than that they say,

sorry you can't do that.

So the things you can do there are, split your files into one

megabyte chunks.

Which kind of sucks, so I wouldn't do that.

Or you can basically use uncompressed resources.

And then if you really need the compression you can

compress them yourself and uncompress them yourself and

you can be smarter than the Android guys.

I hope everybody knows how to write a decompressor.

Or how to use zlib, which does the decompression for you but

doesn't have a static max-size buffer.

So they fixed that in 2.3.

And that's that about ByteBuffer.

So that's already our last section.

We're doing fantastic on time.

So I'd like to say a few words about performance here.

The first word is measure.

So if you're trying to do performance improvements,

always measure if they actually help and if they

don't then don't do them.

And I have a little demo for that.

About a little pitfall, I guess, when you're measuring

performance, that I found.

So this little program here, basically just clears the

color and the depth buffer seven times per frame.

Which is obviously not a very useful thing to do.

But it's interesting for measuring performance.

As you might know, tablets are fill rate limited, and this

can give you an idea of how much fill rate you can get in

the best case.

So it turns out seven clear screens is the upper bound you

can do to still get 60 frames.

So if you draw every pixel more than seven times per frame, you

probably won't get 60 frames per second.

And that's with the cheapest filling possible, right?

Normally you'd also do some geometry

transforms and whatnot.

So I was interested in finding out what this number here is.

So I wrote this program.

And let's run this.

So what this will do, it will, again, compile the thing and

upload it to the device.

The device will measure how fast it's drawing and send

that back to the laptop and it'll hopefully

show up here on screen.

And for demonstration purposes, the app

measures the frame time every frame and sends it.

So normally you'd want to measure for the last second

and display an average for the last second.
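
Such a once-per-second average can be sketched like this (FrameTimer is a hypothetical helper; you'd feed it System.nanoTime() from onDrawFrame):

```java
public class FrameTimer {
    private long windowStartNs = -1;
    private int frames;

    // Call once per frame with a monotonic timestamp. Returns the average
    // frame time in milliseconds once per second, or -1 while accumulating.
    public double onFrame(long nowNs) {
        if (windowStartNs < 0) {
            windowStartNs = nowNs;
            frames = 0;
            return -1;
        }
        frames++;
        long elapsed = nowNs - windowStartNs;
        if (elapsed >= 1_000_000_000L) {
            double avgMs = (elapsed / 1e6) / frames;
            windowStartNs = nowNs;
            frames = 0;
            return avgMs;
        }
        return -1;
    }
}
```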

But if you do this for every frame, you'll

see a curious thing.

Every frame either takes exactly 1/60 of a second or

1/30 of a second.

So that oscillates between 60 frames per second and 30

frames per second.

Or if you think milliseconds per frame is better, either 16

or 32 milliseconds per frame.

And I'm not sure why that is, exactly.

But my theory is that the, oh--

gold star!

Someone suggests it's the Vsync.

So Vsync is what the old tube monitors use, I guess.

So I guess there's some kind of double buffering going on

and the compositor that draws the Android interface

basically only wants to render at 60 hertz.

And if your frame takes just a millisecond longer than 16

milliseconds, then you have to wait for the next time Android

allows you to paint.

And this makes it kind of hard to do performance

measurement, right?

Because if you're one millisecond too slow then you

pay another 16 milliseconds for your frame.

And that makes it hard to evaluate whether any rendering changes actually improve performance.

And as it turns out there's some hack that happens to undo

this effect somehow.

So I guess it somehow enables triple buffering, but I don't

know what's going on there exactly.

I stumbled upon this.

So this hack is done by this function, which I'll show on

the next slide.

So if you do this call here.

And it's compiling again, uploading again.

And now you see that this is a pretty constant function, just a little bit over 16 milliseconds, which is what caused this jittering before.

So since I don't really know what this function does up there, I wouldn't recommend using it in your shipping app.
But it's pretty useful for doing performance

measurements, right?

So I guess double buffering is what's causing this, somehow.

But who knows?

If you call egl--

you need to call some function that's not

exposed through Java.

So you need JNI again.

All this code is on some Google code site and I'll post

the link at the end so you can play with this at home.

So if you call eglSurfaceAttrib with EGL_SWAP_BEHAVIOR set to EGL_BUFFER_PRESERVED, then this somehow magically disables something, or enables something, that allows you to do better performance measurements.

If you do a web search for swap behavior, then I think

there's one page on this.

And this page tells you to never use EGL_BUFFER_PRESERVED because it makes things slow.

And I guess that's true.

But on some hardware it allows you to do useful time measurements.
So this is on the Tegra 2 on tablets.

I guess the Samsung tablet that you got also uses a Tegra 2.

If you run this on a [UNINTELLIGIBLE] it doesn't support this attribute and just crashes.

So it's very dangerous but useful for measurement.

So measure your stuff.

Now that we know how to measure, let's see how we can

improve performance.

So here are the basics.

You always want to use vertex buffer objects.

So don't upload your vertex data every frame.

Instead upload it into a VBO once and then only pass an integer handle to OpenGL.

Always use indexed geometry.

So as most of you will know, when you render two triangles

that are right next to each other, you basically first

send these three vertices to the GPU and

then these three vertices.

And if you send the four vertices that way, then you're basically sending this and this vertex twice.

And that's expensive.

So in practice you usually only send indices.

You say, draw a triangle with vertices one, two, three and then one, three, four.

And that way you only transfer the small index twice instead of the whole vertex, which is almost always a win.

So do that.
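The savings are easy to sketch. Here's an illustrative calculation with made-up sizes, 32-byte vertices and 16-bit indices, counting the bytes sent for a mesh of quads; the numbers are mine, not from the talk:

```java
// Sketch: bytes sent for a grid of quads, unindexed vs indexed.
// Sizes are illustrative: 32-byte vertices, 16-bit (GL_UNSIGNED_SHORT) indices.
public class IndexedGeometry {
    static final int VERTEX_BYTES = 32; // e.g. position + normal + texcoord
    static final int INDEX_BYTES = 2;

    // Two triangles per quad, three vertices each, no sharing.
    static int unindexedBytes(int quads) {
        return quads * 6 * VERTEX_BYTES;
    }

    // Four shared vertices per quad plus six small indices.
    static int indexedBytes(int quads) {
        return quads * (4 * VERTEX_BYTES + 6 * INDEX_BYTES);
    }
}
```

For one quad that's 192 bytes unindexed versus 140 indexed, and the gap only grows once vertices are shared between quads too.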

OpenGL gives you the flexibility to order your vertices so that you have one chunk of memory where all the vertex positions are, then another chunk of memory where all the normals are, another chunk where all the texture coordinates are.

But don't do that.

You should always keep each vertex's data in one small, contiguous piece of memory.

So you want to have vertex position right next to normal

or texture coordinate.
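Here's a minimal sketch of building such an interleaved buffer in Java, with a layout I made up for illustration: three position floats, three normal floats, and two texture coordinates per vertex.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: one interleaved vertex buffer (position, normal, texcoord per
// vertex) instead of three separate arrays. The layout is my choice.
public class InterleavedVertices {
    // 3 position + 3 normal + 2 texcoord floats = 8 floats = 32 bytes.
    static final int FLOATS_PER_VERTEX = 8;
    static final int STRIDE_BYTES = FLOATS_PER_VERTEX * 4;

    static ByteBuffer interleave(float[] pos, float[] nrm, float[] uv) {
        int n = pos.length / 3;
        ByteBuffer buf = ByteBuffer.allocateDirect(n * STRIDE_BYTES)
                                   .order(ByteOrder.nativeOrder());
        for (int i = 0; i < n; i++) {
            buf.putFloat(pos[3 * i]).putFloat(pos[3 * i + 1]).putFloat(pos[3 * i + 2]);
            buf.putFloat(nrm[3 * i]).putFloat(nrm[3 * i + 1]).putFloat(nrm[3 * i + 2]);
            buf.putFloat(uv[2 * i]).putFloat(uv[2 * i + 1]);
        }
        buf.rewind();
        return buf;
    }
}
```

You'd then point glVertexAttribPointer at this one buffer with the 32-byte stride and the per-attribute offsets.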

And then, as I said, the caches on some of these GPUs aren't very big.

So you want to keep your attributes small.

So for normals you can usually get away with just a signed byte; that's usually enough resolution for a normal.

For texture coordinates, you might get away with half floats.

So half floats are not officially supported by ES 2.0, but like ETC textures they are supported virtually everywhere.


So think about doing this.
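Here's a quick sketch of the signed-byte idea, quantizing a normal component the way a normalized GL_BYTE attribute would be reconstructed on the GPU. The class and helper names are mine:

```java
// Sketch: quantize a unit normal component from float to a signed byte,
// as for a normalized GL_BYTE vertex attribute. Names are illustrative.
public class ByteNormals {
    static byte quantize(float v) {   // v in [-1, 1]
        return (byte) Math.round(v * 127.0f);
    }

    static float dequantize(byte b) { // roughly what the GPU reconstructs
        return Math.max(b / 127.0f, -1.0f);
    }
}
```

The round trip is accurate to within about 1/127, which is plenty for lighting.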

Also, since your code will run on different devices, with

different frame rates, you should make animation

time-based not frame rate-based.

So if you have some animation and some device renders your

app at 30 frames and the next at 60 frames, the animation

should take the same length and not be twice as fast just

because the device renders twice as fast.
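A minimal sketch of time-based animation, with names I made up: progress depends only on the clock, so a 30 fps device and a 60 fps device finish the animation at the same wall-clock time.

```java
// Sketch: drive animation by elapsed time, not by frame count.
public class TimedAnimation {
    final long startMs;
    final long durationMs;

    TimedAnimation(long startMs, long durationMs) {
        this.startMs = startMs;
        this.durationMs = durationMs;
    }

    // Progress in [0, 1]; depends only on the clock, not on how many
    // frames have been rendered so far.
    float progressAt(long nowMs) {
        float t = (nowMs - startMs) / (float) durationMs;
        return Math.min(Math.max(t, 0f), 1f);
    }
}
```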

So that's the basics, basically.

And now once you've written your app and it's kind of

slow, the first thing you do is you set the glViewport to a

1x1 pixel thingee.

And then either frame rate goes up or it doesn't.

If it does go up, then you are either fragment processor

bound or texture fetch bound.

And you differentiate that by making all your textures

really small.

And if stuff gets--

if that doesn't help then you are fragment processor bound.

And if that helps, you're texture bound.

So if you're fragment processor bound, there are a

few things you can do.

You can move work from the fragment shader to the vertex shader.

In my experience, fragment shaders on mobile devices have

to be, like, one or two lines.

So you can't do lots of fancy effects there.

If you want to do very fancy lighting you can basically pre-compute all your lighting formulas into a texture and then do a texture lookup instead of doing the calculations.
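Here's a sketch of that baking idea, using a specular power term as the example formula; that's my choice of example, not Body's actual lighting:

```java
// Sketch: bake an expensive lighting term (here pow() for specular) into a
// lookup table, so the shader does one texture fetch instead of the math.
public class SpecularLut {
    static float[] bake(int size, float shininess) {
        float[] lut = new float[size];
        for (int i = 0; i < size; i++) {
            float nDotH = i / (float) (size - 1); // the texture coordinate
            lut[i] = (float) Math.pow(nDotH, shininess);
        }
        return lut; // upload as a 1D (or Nx1) luminance texture
    }
}
```

The fragment shader then samples this table with N·H as the coordinate instead of calling pow.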

You shouldn't draw backfacing triangles, the ones that face away from the camera.

And you shouldn't use discard in your fragment shaders.

But the main point is: do less work in your fragment shaders here.

If you're texture fetch bound, if you're not using texture

compression yet, you should.

One thing that also helps is to use mipmaps because of

cache coherency.

And of course use smaller textures.

One thing I forgot to mention on the ETC slide, on the

texture compression slide is that ETC doesn't support an

alpha channel.

So if you have textures that use an alpha channel, then

there's not a single compressed texture format that

works on all devices.

So in that case, you probably have to download the right

compressed textures on first run, depending

on the device type.

Or, if you don't have many alpha textures, just not use compressed textures for those.


But if you're running into this problem and not all your

textures are compressed then try that first.

So if you're not fragment processor bound, you're

probably vertex processor bound.

So if using a very small viewport doesn't really help

you, you're probably vertex processor bound.

In that case, use fewer, smaller attributes.

So try using signed bytes for your normals, and so on.

You can play with the precision qualifiers, the precision keyword, in OpenGL ES.

So instead of doing lighting per vertex, instead of transforming the light vector into model space at every vertex, you can transform the light vector once and then just read the transformed light vector.
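A sketch of that trick: transform the direction once on the CPU with a column-major 3x3 matrix and hand the result to the shader as a uniform. The helper is hypothetical:

```java
// Sketch: transform the light direction once on the CPU instead of
// per vertex in the shader. m is a column-major 3x3 matrix.
public class LightTransform {
    static float[] transform(float[] m, float[] v) {
        return new float[] {
            m[0] * v[0] + m[3] * v[1] + m[6] * v[2],
            m[1] * v[0] + m[4] * v[1] + m[7] * v[2],
            m[2] * v[0] + m[5] * v[1] + m[8] * v[2],
        };
    }
}
```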

You can use level of detail.

And you can cull objects that are outside of the viewport.

So that's all the--

I guess, pretty normal--

performance stuff that's also true on normal devices.

Finally if you are CPU bound, then use less CPU.

So one thing that's expensive, can be expensive, is if you

allocate memory a lot in your inner loops.

In that case, reuse memory.

Batch draw calls.

So don't have a for loop in your draw method that basically tells the GPU: draw this triangle, now this, now this, now this, looping over all triangles.

Instead have one call that draws all triangles.
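Here's a minimal sketch of the memory-reuse idea: a scratch buffer that only grows, so steady-state frames allocate nothing and the garbage collector stays out of your render loop. The name is mine:

```java
// Sketch: reuse one scratch array across frames instead of allocating
// in the inner loop, so the GC doesn't kick in during rendering.
public class ScratchBuffer {
    private float[] scratch = new float[0];

    // Grows only when needed; steady-state frames allocate nothing.
    float[] acquire(int size) {
        if (scratch.length < size) {
            scratch = new float[size];
        }
        return scratch;
    }
}
```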

And if all else fails, you can look at the NDK and try to

write native code for your time-critical functions.

In my experience, that doesn't help all that much.

And that's that, I think.

So thanks for listening.


Watch me type my password.

So code, slides and so on are available at this website.

If you do a web search for io 2011 OpenGL Android

it might show up.

So the project used to be hidden earlier today.

I don't know if it's visible now.

So we have these feedback links that are completely

impossible to pronounce.

So, if you want to tell me anything.

And that's that.

And I'll download Body for Android and play with it a

little bit.

So do we have any questions?


AUDIENCE: Do you know how to do OpenGL to a widget?

NICO WEBER: I don't.

I haven't looked at the widget stuff at all, yet.

AUDIENCE: Aside from using compressed textures, how can I

speed up the process of reloading my textures when my

surface is recreated?

NICO WEBER: How do you do the reading?

Do you just use the ETC1 texture [UNINTELLIGIBLE]

to read the texture, or?


AUDIENCE: I'm writing for older versions of Android.

NICO WEBER: So one thing that I think might work, which I want to do for Body but haven't done yet, is basically you read all your texture data from the application's data once and then you keep it in a memory cache ready for upload.

AUDIENCE: A memory cache?

NICO WEBER: Yeah, basically.

You keep them around so you can just upload them immediately.

And if your activity's onLowMemory is called, you drop this cache.

And then basically you have them in memory already and you don't need to reload them.

That's something I would try.
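A minimal sketch of that cache, with names I made up; the real thing would hold the decoded texture bytes and be wired up to Activity.onLowMemory():

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the cache idea from the answer: keep decoded texture bytes
// in memory so surface recreation can re-upload without re-reading
// files, and drop everything when the system reports low memory.
public class TextureCache {
    private final Map<String, byte[]> decoded = new HashMap<String, byte[]>();

    void put(String name, byte[] data) { decoded.put(name, data); }

    byte[] get(String name) { return decoded.get(name); } // null = reload from disk

    // Call this from your Activity's onLowMemory().
    void onLowMemory() { decoded.clear(); }
}
```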

AUDIENCE: I'd like to get some detail on that.


Maybe later.



Do you have any issues with transparency?

Because I know that it looks like Body makes pretty heavy

use of showing some kind of opaque model of

a translucent shell.

And in GL that can be tricky to get order right.


NICO WEBER: So that's a known deficiency with OpenGL.

So Body, I think, just doesn't care that much.

So it doesn't look perfect.

But it looks good enough, I think.

So basically, the usual way to do this is to draw your non-transparent stuff first, and then sort your transparent [UNINTELLIGIBLE] on the CPU and draw them back to front.

And that's slow, because you need to sort stuff.

There's this depth peeling technique by Cass Everitt, but that means you need to render the scene [UNINTELLIGIBLE] for that.

So I don't think there's a good general

answer to that question.

You need to see what works for your app.

In Body, I just don't really care at the moment.

AUDIENCE: So what did you do for Body?

Did you just--

NICO WEBER: I just say GL blend mode one source of--

AUDIENCE: But you drew the opaque part and then just--

NICO WEBER: No, I just draw everything.

So Body basically has these layers.

There's organs, skeleton and so on.

And I draw the inner layer first and the outer layer transparently, doing the [UNINTELLIGIBLE].

But per layer I just say, transparency

on and do your thing.

And I think I draw the opaque things first.



AUDIENCE: So have you considered using something

like the [? SP3s ?] so you can get the transparency right?

NICO WEBER: Yeah, I have considered it.

But it doesn't seem like the most critical thing I should

be working on right now.

So as I said--

AUDIENCE: Right, right.

As a 20% thing.

Your spare time, other than sleeping.

NICO WEBER: Well it's my Friday, basically.

So I thought about that, but I haven't done it yet.

AUDIENCE: But I guess the question is, do you see

problems in trying to take that approach

with Java on GL 2.0?

Are you going to get hung up on computation?

Are you going to get hung up on

pushing the indices through?

NICO WEBER: Try it, I guess.

So writing a demo for that should take, maybe, two hours?

And then you know.

That's what I would do.

But I don't know.

So I guess if stuff turns out to be slow you can always go

to native code.

But it worked on really slow machines 12 years ago, or even

longer than that, so I guess it should work fine.

AUDIENCE: Have you considered or looked at Renderscript, by the way?

NICO WEBER: So when I wrote this, for 3.0, there was even less documentation on Renderscript than there is today.

I think I had heard of a name, but nothing else.

So not really.

And also I think Renderscript is 3.0-only and

Android-only and so on.

So I think, not yet.

AUDIENCE: OK, thanks.


AUDIENCE: When you're using GLSurfaceView and on top of

which, you might want to use an Android 2-D graphics widget

or ListView, let's say, the performance drops


I can understand there's 2D computation, there's the 3D

computation in the background happening.

But have you guys thought about it?

Like how do you deal with this in the future?

How about combining 2D graphics APIs and 3D graphics?

NICO WEBER: So with you guys, do you mean me the Google Body

developer or us the Android framework guys?

AUDIENCE: Generally Android framework.

NICO WEBER: So I have no idea what the

Android guys are doing.

I'm sorry.

AUDIENCE: Any tips and tricks you might have seen?


NICO WEBER: OK, so I am told to recommend the Office Hours.

So what Google Body does, if you tap things it draws these

little text widgets.

And I'm using OpenGL text just for that because I didn't want

to deal with mixing 3D and 2D.

But I think Google Maps puts 2D widgets on top of the map, and the map is a GLSurfaceView, so it kind of works.

So I guess it depends on if you're writing a game where

you really need that 60 frames per second.

And in that case, you don't want to put anything on top of

your thing.

Or if you're writing an app, in that case it might be fine.

AUDIENCE: OK, thanks.

NICO WEBER: More questions?

AUDIENCE: I've got a question about the cow.


AUDIENCE: Specifically, why are its teats on show,

compared to the female model?

NICO WEBER: So I don't know.

The web version did that.

I hadn't ported the cow yet to the tablet version.

So I haven't looked into that yet.

Though I can see the cow on the tablet being really useful

if you go to a steak house you can be like,

can I have this piece?

So that's my motivation, there.

But I haven't had time yet.

What's that?

Oh yeah, that's pretty fancy, huh?

So locally on the notebook, there's a little ghost server

running that basically--

so I have a web socket connection to the local ghost server, and then it copies that into a Java file, invokes ant to compile this thing, then invokes adb to copy it over, and then adb logcat to grab the output with the frame stuff and send it back up the web socket.

Yeah, that's also on the slide project.

And this took way longer to do than it was useful.

But oh, well.

Yeah, the question was, how did the run button work in the demo.


More questions?

Come on guys, we have five minutes left.

No questions?

All right, then.

Thanks for listening, again.


NICO WEBER: Oh, there's one last question.

AUDIENCE: In the fill rate example, you cleared both the

color buffer and the depth buffer.

I mean if you, actually I was confused.

Does the fill rate work-- were you trying look for both depth

and color buffer?

NICO WEBER: Yeah so if you just clear the color buffer

then you can go higher than seven.

So that's faster, if that's the question.

But you can just try it yourself.

AUDIENCE: I mean like the depth buffer is like, you

don't write to it so often, is what I got.

It's much smaller than the color buffer, so--

NICO WEBER: I think the Tegra 2 usually uses 16-bit colors for the color buffer, for performance.

And the Tegra also only has 16-bit z-buffers, so they're the same size.

AUDIENCE: Both are 16-bit?

NICO WEBER: Both are 16-bit per pixel.


Thank you.

NICO WEBER: And phones actually have, I guess, 32-bit depth per pixel, so you can get some z-fighting artifacts on Tegra if you're not careful.

All right then, I'll just say thank you, again.

And usually someone else pops up.

AUDIENCE: It's not about the cow this time.

The question was on, did you try the fill buffer test with

textures as well and see what the throughput was on that?

NICO WEBER: I think I did and I think it was identical.

AUDIENCE: It was lower than 7 per?

NICO WEBER: I think it was the same.

AUDIENCE: There was no difference?

So the fill rate is identical for texture and

also uniform color?

NICO WEBER: I think so, yes.

Pretty sure I tried that.

But don't believe me anything.

Just try everything yourself.

It's easy and quick to do.

All right, that's all folks.

