Dec 19, 2011

Final Project

Project Report can be downloaded here:
http://dl.dropbox.com/u/5870963/GPUFinal/finalproject_EricCheng.pdf

The video presentation is on YouTube:

Poster can be downloaded here:
http://dl.dropbox.com/u/5870963/GPUFinal/poster.pdf

And the source code is here:
http://dl.dropbox.com/u/5870963/GPUFinal/Source.zip
It is an Xcode project, so you have to open it on a Mac with Xcode installed.

Voila!

Happy Holidays!
Merry Christmas!

BTW, I'll keep improving my stuff in 2012, so I'll be back ;-)
Thanks to Joe Kider and all my classmates in CIS565, this is a wonderful course.

Eric

Dec 18, 2011

Building a fat lib file for OpenCV

Last time we generated four sets of lib files: Release and Debug builds for both the simulator and the device. However, it's more convenient to link a single lib file per library in our project. Here is a solution: we can use the lipo tool to merge the device and simulator slices into universal (fat) binaries.
The universal binaries will be placed in the build/lib/universal folder. "build" is the name of the directory you entered in the first stage, when generating the workspace with CMake. The include headers can be obtained in build/include by running the install target in Xcode.
Here is a bash script that merges them together:


# Create armv7 + i386 OpenCV library
 
mkdir -p build/lib/universal
 
lipo -create build/lib/Release-iphoneos/libopencv_calib3d.a    build/lib/Release-iphonesimulator/libopencv_calib3d.a    -output build/lib/universal/libopencv_calib3d.a
lipo -create build/lib/Release-iphoneos/libopencv_contrib.a    build/lib/Release-iphonesimulator/libopencv_contrib.a    -output build/lib/universal/libopencv_contrib.a
lipo -create build/lib/Release-iphoneos/libopencv_core.a       build/lib/Release-iphonesimulator/libopencv_core.a       -output build/lib/universal/libopencv_core.a
lipo -create build/lib/Release-iphoneos/libopencv_features2d.a build/lib/Release-iphonesimulator/libopencv_features2d.a -output build/lib/universal/libopencv_features2d.a
lipo -create build/lib/Release-iphoneos/libopencv_gpu.a        build/lib/Release-iphonesimulator/libopencv_gpu.a        -output build/lib/universal/libopencv_gpu.a
lipo -create build/lib/Release-iphoneos/libopencv_imgproc.a    build/lib/Release-iphonesimulator/libopencv_imgproc.a    -output build/lib/universal/libopencv_imgproc.a
lipo -create build/lib/Release-iphoneos/libopencv_legacy.a     build/lib/Release-iphonesimulator/libopencv_legacy.a     -output build/lib/universal/libopencv_legacy.a
lipo -create build/lib/Release-iphoneos/libopencv_ml.a         build/lib/Release-iphonesimulator/libopencv_ml.a         -output build/lib/universal/libopencv_ml.a
lipo -create build/lib/Release-iphoneos/libopencv_objdetect.a  build/lib/Release-iphonesimulator/libopencv_objdetect.a  -output build/lib/universal/libopencv_objdetect.a
lipo -create build/lib/Release-iphoneos/libopencv_video.a      build/lib/Release-iphonesimulator/libopencv_video.a      -output build/lib/universal/libopencv_video.a
lipo -create build/lib/Release-iphoneos/libopencv_flann.a      build/lib/Release-iphonesimulator/libopencv_flann.a      -output build/lib/universal/libopencv_flann.a
lipo -create build/3rdparty/lib/Release-iphoneos/libopencv_lapack.a build/3rdparty/lib/Release-iphonesimulator/libopencv_lapack.a -output build/lib/universal/libopencv_lapack.a
lipo -create build/3rdparty/lib/Release-iphoneos/liblibjpeg.a       build/3rdparty/lib/Release-iphonesimulator/liblibjpeg.a       -output build/lib/universal/liblibjpeg.a
lipo -create build/3rdparty/lib/Release-iphoneos/liblibpng.a        build/3rdparty/lib/Release-iphonesimulator/liblibpng.a        -output build/lib/universal/liblibpng.a
lipo -create build/3rdparty/lib/Release-iphoneos/libzlib.a          build/3rdparty/lib/Release-iphonesimulator/libzlib.a          -output build/lib/universal/libzlib.a
 
lipo -create build/lib/Debug-iphoneos/libopencv_calib3d.a    build/lib/Debug-iphonesimulator/libopencv_calib3d.a    -output build/lib/universal/libopencv_calib3dd.a
lipo -create build/lib/Debug-iphoneos/libopencv_contrib.a    build/lib/Debug-iphonesimulator/libopencv_contrib.a    -output build/lib/universal/libopencv_contribd.a
lipo -create build/lib/Debug-iphoneos/libopencv_core.a       build/lib/Debug-iphonesimulator/libopencv_core.a       -output build/lib/universal/libopencv_cored.a
lipo -create build/lib/Debug-iphoneos/libopencv_features2d.a build/lib/Debug-iphonesimulator/libopencv_features2d.a -output build/lib/universal/libopencv_features2dd.a
lipo -create build/lib/Debug-iphoneos/libopencv_gpu.a        build/lib/Debug-iphonesimulator/libopencv_gpu.a        -output build/lib/universal/libopencv_gpud.a
lipo -create build/lib/Debug-iphoneos/libopencv_imgproc.a    build/lib/Debug-iphonesimulator/libopencv_imgproc.a    -output build/lib/universal/libopencv_imgprocd.a
lipo -create build/lib/Debug-iphoneos/libopencv_legacy.a     build/lib/Debug-iphonesimulator/libopencv_legacy.a     -output build/lib/universal/libopencv_legacyd.a
lipo -create build/lib/Debug-iphoneos/libopencv_ml.a         build/lib/Debug-iphonesimulator/libopencv_ml.a         -output build/lib/universal/libopencv_mld.a
lipo -create build/lib/Debug-iphoneos/libopencv_objdetect.a  build/lib/Debug-iphonesimulator/libopencv_objdetect.a  -output build/lib/universal/libopencv_objdetectd.a
lipo -create build/lib/Debug-iphoneos/libopencv_video.a      build/lib/Debug-iphonesimulator/libopencv_video.a      -output build/lib/universal/libopencv_videod.a
lipo -create build/lib/Debug-iphoneos/libopencv_flann.a      build/lib/Debug-iphonesimulator/libopencv_flann.a      -output build/lib/universal/libopencv_flannd.a
lipo -create build/3rdparty/lib/Debug-iphoneos/libopencv_lapack.a      build/3rdparty/lib/Debug-iphonesimulator/libopencv_lapack.a      -output build/lib/universal/libopencv_lapackd.a
lipo -create build/3rdparty/lib/Debug-iphoneos/liblibjpeg.a       build/3rdparty/lib/Debug-iphonesimulator/liblibjpeg.a       -output build/lib/universal/liblibjpegd.a
lipo -create build/3rdparty/lib/Debug-iphoneos/liblibpng.a        build/3rdparty/lib/Debug-iphonesimulator/liblibpng.a        -output build/lib/universal/liblibpngd.a
lipo -create build/3rdparty/lib/Debug-iphoneos/libzlib.a          build/3rdparty/lib/Debug-iphonesimulator/libzlib.a          -output build/lib/universal/libzlibd.a

OpenCV and iOS Integration



In the project, I need to convert data between the OpenCV image class (IplImage) and the iOS image class (UIImage). The best way to do this is via Objective-C category methods. Objective-C categories provide a means to add methods to an existing class, and any methods we add through a category become part of the class definition.


So we can have a UIImage-OpenCVExtensions.h file:

#import <UIKit/UIKit.h>
#import "opencv2/opencv.hpp"

@interface UIImage (OpenCVExtensions)

// Creates an IplImage in gray, BGR or BGRA format. It is the caller's responsibility to cvReleaseImage() the return value.
- (IplImage *)createIplImageWithNumberOfChannels:(int)channels;

// Returns a UIImage by copying the IplImage's bitmap data. 
- (id)initWithIplImage:(IplImage *)iplImage;
- (id)initWithIplImage:(IplImage *)iplImage orientation:(UIImageOrientation)orientation;

// Returns an affine transform that takes into account the image orientation when drawing a scaled image
- (CGAffineTransform)transformForOrientationDrawnTransposed:(BOOL *)drawTransposed;

@end

And the implementation is like this:
#import "UIImage-OpenCVExtensions.h"

static inline void premultiplyImage(IplImage *img, BOOL reverse);
static void releaseImage(void *info, const void *data, size_t size);


@implementation UIImage (OpenCVExtensions)

- (IplImage *)createIplImageWithNumberOfChannels:(int)channels
{
    NSAssert(channels == 1 || channels == 3 || channels == 4, @"Invalid number of channels");
    
    CGImageRef cgImage = [self CGImage];
    BOOL drawTransposed;
    CGAffineTransform transform = [self transformForOrientationDrawnTransposed:&drawTransposed];
    
    CvSize cvsize = cvSize(drawTransposed ? CGImageGetHeight(cgImage) : CGImageGetWidth(cgImage),
                           drawTransposed ? CGImageGetWidth(cgImage) : CGImageGetHeight(cgImage));
    IplImage *iplImage = cvCreateImage(cvsize, IPL_DEPTH_8U, (channels == 3) ? 4 : channels);       // CG can only write into 4 byte aligned bitmaps
    
    CGBitmapInfo bitmapInfo = kCGImageAlphaNone;
    if (channels == 3) {
        bitmapInfo = kCGImageAlphaNoneSkipFirst | kCGBitmapByteOrder32Little;        // BGRX. CV_BGRA2BGR will discard the uninitialized alpha channel data.
    } else if (channels == 4) {
        bitmapInfo = kCGImageAlphaPremultipliedFirst | kCGBitmapByteOrder32Little;   // BGRA. Must unpremultiply the image.
    }
    
    CGColorSpaceRef colorSpace = (channels == 1) ? CGColorSpaceCreateDeviceGray() : CGColorSpaceCreateDeviceRGB();
    CGContextRef bitmapContext = CGBitmapContextCreate(iplImage->imageData,
                                                       iplImage->width,
                                                       iplImage->height,
                                                       iplImage->depth,
                                                       iplImage->widthStep,
                                                       colorSpace,
                                                       bitmapInfo);
    CGColorSpaceRelease(colorSpace);
    
    
    // Rotate and/or flip the image if required by its orientation
    CGContextConcatCTM(bitmapContext, transform);
    
    // Copy the source bitmap into the destination, ignoring any data in the uninitialized destination
    CGContextSetBlendMode(bitmapContext, kCGBlendModeCopy);
    
    // Drawing CGImage to CGContext
    CGRect rect = CGRectMake(0.0, 0.0, CGImageGetWidth(cgImage), CGImageGetHeight(cgImage));
    CGContextDrawImage(bitmapContext, rect, cgImage);
    CGContextRelease(bitmapContext);
    
    // Unpremultiply the alpha channel if the source image had one (since otherwise the alphas are 1)
    CGImageAlphaInfo alphaInfo = CGImageGetAlphaInfo(cgImage);
    if (channels == 4 && (alphaInfo != kCGImageAlphaNone && alphaInfo != kCGImageAlphaNoneSkipFirst && alphaInfo != kCGImageAlphaNoneSkipLast)) {
        premultiplyImage(iplImage, YES);
    }
    
    // Convert BGRA images to BGR
    if (channels == 3) {
        IplImage *temp = cvCreateImage(cvGetSize(iplImage), IPL_DEPTH_8U, channels);
        cvCvtColor(iplImage, temp, CV_BGRA2BGR);
        cvReleaseImage(&iplImage);
        iplImage = temp;
    }
    
    return iplImage;
}

- (id)initWithIplImage:(IplImage *)iplImage
{
    return [self initWithIplImage:iplImage orientation:UIImageOrientationUp];
}

- (id)initWithIplImage:(IplImage *)iplImage orientation:(UIImageOrientation)orientation
{
    // CGImage requires either 8-bit or 32-bit aligned images
    IplImage *formattedImage;
    if (iplImage->nChannels == 3) {
        formattedImage = cvCreateImage(cvGetSize(iplImage), IPL_DEPTH_8U, 4);
        cvCvtColor(iplImage, formattedImage, CV_BGR2BGRA);
    } else if (iplImage->nChannels == 4) {
        formattedImage = cvCloneImage(iplImage);
        premultiplyImage(formattedImage, NO);
    } else {
        formattedImage = cvCloneImage(iplImage);
    }
    
    CGDataProviderRef provider = CGDataProviderCreateWithData(formattedImage, formattedImage->imageData, formattedImage->imageSize, releaseImage);
    
    CGBitmapInfo bitmapInfo = (iplImage->nChannels == 1) ? kCGImageAlphaNone : (kCGImageAlphaPremultipliedFirst | kCGBitmapByteOrder32Little);    
    CGColorSpaceRef colorSpace = (formattedImage->nChannels == 1) ? CGColorSpaceCreateDeviceGray() : CGColorSpaceCreateDeviceRGB();    
    CGImageRef cgImage = CGImageCreate(formattedImage->width,
                                       formattedImage->height,
                                       formattedImage->depth,
                                       formattedImage->depth * formattedImage->nChannels,
                                       formattedImage->widthStep,
                                       colorSpace,
                                       bitmapInfo,
                                       provider,
                                       NULL,
                                       false,
                                       kCGRenderingIntentDefault);
    CGColorSpaceRelease(colorSpace);
    CGDataProviderRelease(provider);
    
    self = [self initWithCGImage:cgImage scale:1.0 orientation:orientation];
    CGImageRelease(cgImage);
    
    return self;
}

static inline void premultiplyImage(IplImage *img, BOOL reverse)
{
    NSCAssert(img->depth == IPL_DEPTH_8U, @"depth not IPL_DEPTH_8U");
    uchar *row = (uchar *)img->imageData;
    
    for (int i = 0; i < img->height; i++) {
        for (int j = 0; j < img->width * img->nChannels; j += img->nChannels) {    // j steps over bytes, one pixel (nChannels bytes) at a time
            uchar alpha = row[j + 3];
            if (alpha != UCHAR_MAX && (!reverse || alpha != 0)) {
                for (int k = 0; k < 3; k++) {
                    if (reverse) {
                        row[j + k] = ((int)row[j + k] * UCHAR_MAX + alpha / 2 - 1) / alpha;
                    } else {
                        row[j + k] = ((int)row[j + k] * alpha + UCHAR_MAX / 2 - 1) / UCHAR_MAX;
                    }
                }
            }
        }
        row += img->widthStep;
    }
}

static void releaseImage(void *info, const void *data, size_t size)
{
    IplImage *image = (IplImage *)info;
    cvReleaseImage(&image);
}

- (CGAffineTransform)transformForOrientationDrawnTransposed:(BOOL *)drawTransposed
{
    UIImageOrientation imageOrientation = [self imageOrientation];
    CGAffineTransform transform = CGAffineTransformIdentity;
    CGSize size = [self size];  // already transposed by UIImage
    
    switch (imageOrientation) {
        case UIImageOrientationDown:           // EXIF orientation 3
        case UIImageOrientationDownMirrored:   // EXIF orientation 4
            transform = CGAffineTransformTranslate(transform, size.width, size.height);
            transform = CGAffineTransformRotate(transform, M_PI);
            break;
            
        case UIImageOrientationLeft:           // EXIF orientation 6
        case UIImageOrientationLeftMirrored:   // EXIF orientation 5
            transform = CGAffineTransformTranslate(transform, size.width, 0);
            transform = CGAffineTransformRotate(transform, M_PI_2);
            break;
            
        case UIImageOrientationRight:          // EXIF orientation 8
        case UIImageOrientationRightMirrored:  // EXIF orientation 7
            transform = CGAffineTransformTranslate(transform, 0, size.height);
            transform = CGAffineTransformRotate(transform, -M_PI_2);
            break;
        default:
            break;
    }
    
    switch (imageOrientation) {
        case UIImageOrientationUpMirrored:     // EXIF orientation 2
        case UIImageOrientationDownMirrored:   // EXIF orientation 4
            transform = CGAffineTransformTranslate(transform, size.width, 0);
            transform = CGAffineTransformScale(transform, -1.0, 1.0);
            break;
            
        case UIImageOrientationLeftMirrored:   // EXIF orientation 5
        case UIImageOrientationRightMirrored:  // EXIF orientation 7
            transform = CGAffineTransformTranslate(transform, size.height, 0);
            transform = CGAffineTransformScale(transform, -1.0, 1.0);
            break;
        default:
            break;
    }
    
    if (drawTransposed) {
        switch (imageOrientation) {
            case UIImageOrientationLeft:
            case UIImageOrientationLeftMirrored:
            case UIImageOrientationRight:
            case UIImageOrientationRightMirrored:
                *drawTransposed = YES;
                break;
                
            default:
                *drawTransposed = NO;
        }
    }
    
    return transform;
}

@end
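
With the category in place, a typical round trip looks something like the sketch below. This is a minimal example of my own (the SmoothedImage helper is hypothetical, not part of the project); it assumes the file is compiled as Objective-C++ and uses manual reference counting, matching the rest of the code in this post.

#import "UIImage-OpenCVExtensions.h"

// Hypothetical helper: blur a UIImage with OpenCV and return a new UIImage.
UIImage *SmoothedImage(UIImage *source)
{
    // UIImage -> 3-channel BGR IplImage. The category bakes the image
    // orientation into the pixels, and we own the returned image.
    IplImage *bgr = [source createIplImageWithNumberOfChannels:3];

    // Any OpenCV call goes here; Gaussian smoothing is just an example.
    IplImage *smoothed = cvCreateImage(cvGetSize(bgr), IPL_DEPTH_8U, 3);
    cvSmooth(bgr, smoothed, CV_GAUSSIAN, 7, 7, 0, 0);
    cvReleaseImage(&bgr);

    // IplImage -> UIImage. initWithIplImage: copies the pixels into its own
    // buffer, so we can release our IplImage afterwards. The pixels are
    // already upright, so the default (Up) initializer is the right one.
    UIImage *result = [[[UIImage alloc] initWithIplImage:smoothed] autorelease];
    cvReleaseImage(&smoothed);
    return result;
}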


Dec 6, 2011

Canny Edge Detection using a fragment shader

I implemented Canny edge detection on the iPhone with GLSL in the fragment shader. I got a lot of help from this paper: http://www.dip.ee.uct.ac.za/prasa/PRASA2010/proceedings/2007/prasa07-26.pdf
and also from this GPU Gems chapter:
http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter40.html
So basically the fragment shader (this is the Cg listing from the GPU Gems chapter) looks like this:

float4 CannySearch(uniform float4 thresh,
                   float2 fptexCoord : TEXCOORD0,
                   uniform samplerRECT FPE1) : COLOR
{
  // magdir holds { dx, dy, mag, direct }
  float4 magdir = texRECT(FPE1, fptexCoord);

  float alpha = 0.5/sin(3.14159/8); // eight directions on grid
  float2 offset = round( alpha.xx * magdir.xy/magdir.zz );

  // Sample the two neighbours along the gradient direction
  float4 fwdneighbour, backneighbour;
  fwdneighbour  = texRECT(FPE1, fptexCoord + offset);
  backneighbour = texRECT(FPE1, fptexCoord - offset);

  // Non-maximum suppression: keep the pixel only if it is a local maximum
  float4 colorO;
  if ( fwdneighbour.z > magdir.z || backneighbour.z > magdir.z )
    colorO = float4(0.0, 0.0, 0.0, 0.0); // not an edgel
  else
    colorO = float4(1.0, 1.0, 1.0, 1.0); // is an edgel

  if ( magdir.z < thresh.x )
    colorO = float4(0.0, 0.0, 0.0, 0.0); // thresholding

  return colorO;
}
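
The listing above is the Cg-style code from the GPU Gems chapter; on the iPhone the same idea has to be expressed as a GLSL ES fragment shader and compiled at runtime. A minimal compile helper might look like the sketch below (it assumes an OpenGL ES 2.0 context is already current and that fragmentSource holds the GLSL translation; this is an illustration, not my actual loader):

#import <Foundation/Foundation.h>
#import <OpenGLES/ES2/gl.h>

// Compiles a GLSL ES fragment shader from source and returns its handle,
// or 0 if compilation failed. Assumes an EAGLContext is already current.
static GLuint CompileFragmentShader(const GLchar *fragmentSource)
{
    GLuint shader = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(shader, 1, &fragmentSource, NULL);
    glCompileShader(shader);

    GLint status = GL_FALSE;
    glGetShaderiv(shader, GL_COMPILE_STATUS, &status);
    if (status == GL_FALSE) {
        GLchar log[1024];
        glGetShaderInfoLog(shader, sizeof(log), NULL, log);
        NSLog(@"Fragment shader compile error: %s", log);
        glDeleteShader(shader);
        return 0;
    }
    return shader;
}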

And this works pretty well on the iPhone. I assigned each contour a different color, so the edges look more distinct:

Nov 28, 2011

UIImage and IplImage

In Cocoa Touch, UIImage is the class we use for images; in OpenCV, however, the corresponding class is IplImage. So we need to convert between the two classes.

We can create an IplImage from a UIImage with this function in Objective-C:


// NOTE: the caller is responsible for calling cvReleaseImage() on the returned image.
- (IplImage *)CreateIplImageFromUIImage:(UIImage *)image {
  // Getting CGImage from UIImage
  CGImageRef imageRef = image.CGImage;


  CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
  // Creating a temporary IplImage for drawing
  IplImage *iplimage = cvCreateImage(
    cvSize(image.size.width,image.size.height), IPL_DEPTH_8U, 4
  );
  // Creating a CGContext for the temporary IplImage
  CGContextRef contextRef = CGBitmapContextCreate(
    iplimage->imageData, iplimage->width, iplimage->height,
    iplimage->depth, iplimage->widthStep,
    colorSpace, kCGImageAlphaPremultipliedLast|kCGBitmapByteOrderDefault
  );
  // Drawing CGImage to CGContext
  CGContextDrawImage(
    contextRef,
    CGRectMake(0, 0, image.size.width, image.size.height),
    imageRef
  );
  CGContextRelease(contextRef);
  CGColorSpaceRelease(colorSpace);


  // Creating result IplImage
  IplImage *ret = cvCreateImage(cvGetSize(iplimage), IPL_DEPTH_8U, 3);
  cvCvtColor(iplimage, ret, CV_RGBA2BGR);
  cvReleaseImage(&iplimage);


  return ret;
}


And vice versa, we can create a UIImage from an IplImage in Objective-C:

// NOTE: you should convert the image to RGB before passing it to this function
- (UIImage *)UIImageFromIplImage:(IplImage *)image {
  CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
  // Allocating the buffer for CGImage
  NSData *data =
    [NSData dataWithBytes:image->imageData length:image->imageSize];
  CGDataProviderRef provider =
    CGDataProviderCreateWithCFData((CFDataRef)data);
  // Creating CGImage from chunk of IplImage
  CGImageRef imageRef = CGImageCreate(
    image->width, image->height,
    image->depth, image->depth * image->nChannels, image->widthStep,
    colorSpace, kCGImageAlphaNone|kCGBitmapByteOrderDefault,
    provider, NULL, false, kCGRenderingIntentDefault
  );
  // Getting UIImage from CGImage
  UIImage *ret = [UIImage imageWithCGImage:imageRef];
  CGImageRelease(imageRef);
  CGDataProviderRelease(provider);
  CGColorSpaceRelease(colorSpace);
  return ret;
}
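
To see the two helpers working together, here is a rough sketch of my own (the method name and thresholds are placeholders) that runs OpenCV's Canny detector on a UIImage. It assumes both functions above are methods of the same class and that the file is compiled as Objective-C++:

- (UIImage *)cannyEdgesFromUIImage:(UIImage *)image {
  // UIImage -> 3-channel BGR IplImage (we own it and must release it)
  IplImage *bgr = [self CreateIplImageFromUIImage:image];

  // Convert to grayscale and run the Canny edge detector
  IplImage *gray  = cvCreateImage(cvGetSize(bgr), IPL_DEPTH_8U, 1);
  IplImage *edges = cvCreateImage(cvGetSize(bgr), IPL_DEPTH_8U, 1);
  cvCvtColor(bgr, gray, CV_BGR2GRAY);
  cvCanny(gray, edges, 50, 150, 3);

  // UIImageFromIplImage: expects an RGB image, so expand the single channel
  IplImage *rgb = cvCreateImage(cvGetSize(bgr), IPL_DEPTH_8U, 3);
  cvCvtColor(edges, rgb, CV_GRAY2RGB);
  UIImage *result = [self UIImageFromIplImage:rgb];

  cvReleaseImage(&bgr);
  cvReleaseImage(&gray);
  cvReleaseImage(&edges);
  cvReleaseImage(&rgb);
  return result;
}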


Using OpenCV on the iPhone

In my project I'm doing a lot of image processing on the GPU; however, we still need OpenCV for plenty of things. Here I'm writing down my experience of using OpenCV on the iPhone:

Building OpenCV 2.2 with Xcode 4 and iOS SDK 4.3
OpenCV 2.2 started to support the Android SDK, but not the iPhone, so we have to build it ourselves. Since the iOS SDK doesn't allow dynamic ".framework" bundles, the solution is to build OpenCV as static ".a" lib files and link them against the app statically.

1. Install CMake. I used Homebrew as the package manager, and I highly recommend it. Compared to MacPorts, it's much cleaner and more elegant.

2. Download OpenCV 2.2 from http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.2/OpenCV-2.2.0.tar.bz2/download

3. tar xjvf OpenCV-2.2.0.tar.bz2

4.
% cd OpenCV-2.2.0
% patch -p1 < ../OpenCV-2.2.0.patch

5.

% cd .. # Back to the top of demo project directory.
% mkdir build_simulator
% cd build_simulator
% ../opencv_cmake.sh Simulator ../OpenCV-2.2.0
% make -j 4
% make install

6.

% cd .. # Back to the top of demo project directory.
% mkdir build_device
% cd build_device
% ../opencv_cmake.sh Device ../OpenCV-2.2.0
% make -j 4
% make install

Using the OpenCV library in the project
Actually it is more complicated than I thought. I googled for a long time and finally got it working.

Add libopencv_core.a etc. from the OpenCV lib directory, for either the simulator or the device. Xcode actually doesn't care which one you add at this point, because the right one is picked up via the library search path.
Add Accelerate.framework, which is used internally by the OpenCV library.
Select your active build target, then open the Build tab in the info panel via the Get Info menu.
Add -lstdc++ and -lz to Other Linker Flags.
Add the path to the OpenCV include directory to Header Search Paths, for both the simulator and the device.
Add the path to the OpenCV lib directory to Library Search Paths, for both the simulator and the device.
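
On top of these settings, remember that any source file that includes the OpenCV headers has to be compiled as Objective-C++ (e.g., renamed to .mm). One common arrangement, shown below as a sketch rather than something required by the steps above, is to import OpenCV from the precompiled prefix header, guarded by __cplusplus and placed before the UIKit import:

// Prefix header (whatever .pch file your project uses)
#ifdef __cplusplus
  // Import OpenCV before any Apple headers so its C++ code compiles cleanly
  #import "opencv2/opencv.hpp"
#endif

#ifdef __OBJC__
  #import <UIKit/UIKit.h>
  #import <Foundation/Foundation.h>
#endif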

Ok. For now we're happy to use OpenCV on the iPhone!


Nov 21, 2011

OpenGL ES Presentation

Demo 1: My Rotating Cube based on Xcode OpenGL ES 2.0 template
Demo 2: Bumpy from iPhone 3D Programming

Oct 7, 2011

What this blog is about and final project pitch

Dear visitors,

This semester I'm taking the CIS565 course at Penn, taught by Joe Kider. Joe is a wonderful instructor, and he gave us various options to choose from for the final project. We're also required to keep blogs documenting our development progress, and that's why this blog was born.

I thought about the final project for several weeks. I've been strongly interested in iOS development for two years, so I had the idea of combining iOS development with the GPU. The graphics processors in iOS devices have been getting better with every generation. In the iPhone 4, Apple used a PowerVR SGX 535 GPU embedded in their A4 chip, which has decent graphics performance. This article compared the iPhone 4 with the popular gaming handhelds, and the iPhone 4 clearly stands out. Here's a chart from Wikipedia describing the technical details of the iPhone GPU family:

Model    Year      Die size (mm²)   Config (core)  MTriangles/s  MPixels/s  Bus width (bit)  DirectX  OpenGL  GFLOPS
SGX520   Jul 2005  2.6 @ 65 nm      1/1            7             250        64               N/A      N/A     0.8?
SGX530   Jul 2005  7.2 @ 90 nm      2/1            14            500?       64               N/A      2.0     1.6
SGX531   Oct 2008  ?                2/1            14?           500?       128              N/A      N/A     1.6?
SGX535   Nov 2007  ?                2/2            14            1000?      64               9.0c     2.1     1.6
SGX540   Nov 2007  ?                4/2            28            1000       64               N/A      2.1     3.2
SGX545   Jan 2010  12.5 @ 65 nm     4/2            40            1000       64               10.1     3.2     3.2?

(Fill rates in MTriangles/s and MPixels/s, and the GFLOPS figures, are quoted at 200 MHz.)

We can see from the table above that, even though mobile GPUs don't have the stunning GFLOPS numbers of desktop GPUs, they're already capable of handling a lot. In 2007, the Khronos Group introduced OpenGL ES 2.0, a major upgrade over OpenGL ES 1.1. OpenGL ES 2.0 eliminates most of the fixed-function rendering pipeline in favor of a programmable one. Almost all rendering features of the transform and lighting pipelines, such as the specification of materials and light parameters formerly set through the fixed-function API, are replaced by shaders written by the graphics programmer. This opened up the possibility for programmers to fully leverage the power of the iPhone GPU.

Right now, there are tons of apps on the iOS platform. Last summer, during my internship at SAP US in Newtown Square, I was doing iOS development under Martin Lang. Martin always showed cool apps to his colleagues, and two of them were very impressive in my opinion. One is called Layar, which overlays the names and detailed information of the buildings around you onto your camera view.

(Screenshot of Layar)

The other app is called WordLens. This app detects text from the camera in real time and translates it, which is really amazing.

(Screenshot of WordLens)

However, none of these apps is open source, and they don't use the GPU to do the computation. So here's where my idea comes in: I want to build an augmented reality app on iOS that leverages the GPU at the same time.

Apple has already done something like this in the latest iOS 4.3: when Apple introduced the iPad 2, they brought Photo Booth from the Mac to iOS.

(Screenshot of Photo Booth on the iPad 2)

Photo Booth uses the Core Image API to process the video frames, which has GLSL underneath. So GPU-powered video processing is doable on iOS.

As for what I want to implement specifically: I read this paper from CVPR, and the algorithm it introduces is cool and efficient. So I want to implement it on iOS, using the GPU to process the video frames and extract text from within them. Ideally, if I could also speak the text aloud with a TTS (text-to-speech) engine, that would be perfect, but right now I have no clue whether that's possible.

So my plan for the final project is listed below:

1. A simple app on the iPhone just to capture video and get the video frames.

2. Implement the algorithms from the paper and extract the text within the frames.

3. Speed up Step 2 with GLSL/Core Image to get better frame rates, ideally real-time performance.

And I think my project could be very useful in the future: for example, blind people could use their iPhones to read information in public places, and it's fun for anyone to play with.

That's it. As suggested by the final project write-up, I use the Twitter account nlyrics2 for posting updates. Wishing myself good luck in the development ahead.