Stupid Video Tricks, CocoaConf Seattle 2014


AV Foundation makes it reasonably straightforward to capture video from the camera and edit together a nice family video. This session is not about that stuff. This session is about the nooks and crannies where AV Foundation exposes what's behind the curtain. Instead of letting AVPlayer read our video files, we can grab the samples ourselves and mess with them. AVCaptureVideoPreviewLayer, meet the CGAffineTransform. And instead of dutifully passing our captured video frames to the preview layer and an output file, how about if we instead run them through a series of Core Image filters? Record your own screen? Oh yeah, we can AVAssetWriter that. With a few pointers, a little experimentation, and a healthy disregard for safe coding practices, Core Media and Core Video let you get away with some neat stuff.

Transcript of Stupid Video Tricks, CocoaConf Seattle 2014

Stupid Video Tricks
Chris Adamson • @invalidname

CocoaConf Seattle • October, 2014

AV Foundation

• Framework for working with time-based media

• Audio, video, timed text (captions / subtitles), timecode

• iOS 4.0 and up, Mac OS X 10.7 (Lion) and up

• Replacing QuickTime on Mac

Ordinary AV Foundation stuff

• Playback: AVAsset + AVPlayer

• Capture to file: AVCaptureSession + AVCaptureDeviceInput + AVCaptureMovieFileOutput

• Editing: AVComposition + AVAssetExportSession

But Why Be Ordinary?

http://www.crunchyroll.com/my-ordinary-life

Introductory Trick

• AVPlayerLayer and AVCaptureVideoPreviewLayer are subclasses of CALayer

• We can do lots of neat things with CALayers

Demo

Make an AVPlayerLayer

self.player = [AVPlayer playerWithPlayerItem:
                  [AVPlayerItem playerItemWithAsset:self.asset]];
self.playerLayer = [AVPlayerLayer playerLayerWithPlayer:self.player];
self.playerLayer.videoGravity = AVLayerVideoGravityResizeAspect;
[self.playerView.layer addSublayer:self.playerLayer];
self.playerLayer.frame = self.playerView.bounds;

Animate the layer

CABasicAnimation *animateRotation;
animateRotation = [CABasicAnimation animationWithKeyPath:@"transform"];
CATransform3D layerTransform = self.playerLayer.transform;
layerTransform = CATransform3DRotate(layerTransform, M_PI, 1.0, 1.0, 0.0);
animateRotation.toValue = [NSValue valueWithCATransform3D:layerTransform];
animateRotation.duration = 2.0;
animateRotation.removedOnCompletion = NO;
animateRotation.fillMode = kCAFillModeForwards;
animateRotation.repeatCount = 1000;
animateRotation.cumulative = YES;
[self.playerLayer addAnimation:animateRotation forKey:@"transform"];
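The session description also teases AVCaptureVideoPreviewLayer plus CGAffineTransform. A hedged sketch of that idea on the capture side, assuming an already-configured AVCaptureSession named captureSession and a container view named previewView (both names are placeholders, not from the slides):

// Preview layer is just a CALayer, so layer tricks apply to live capture too
AVCaptureVideoPreviewLayer *previewLayer =
    [AVCaptureVideoPreviewLayer layerWithSession:captureSession];
previewLayer.videoGravity = AVLayerVideoGravityResizeAspect;
previewLayer.frame = self.previewView.bounds;
[self.previewView.layer addSublayer:previewLayer];

// Spin the live preview with a plain CGAffineTransform
[previewLayer setAffineTransform:CGAffineTransformMakeRotation(M_PI / 8.0)];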

Going Deeper

http://www.crunchyroll.com/angel-beats

[Framework stack diagram: AV Foundation sits on top of Core Audio, Core Media, Video Toolbox, and Core Video]

Core Media

Core Media

• Opaque types to represent time: CMTime, CMTimeRange

• Opaque types to represent media samples and their contents: CMSampleBuffer, CMBlockBuffer, CMFormatDescription

CMSampleBuffer

• Provides timing information for one or more samples: when does this play and for how long?

• Contains either

• CVImageBuffer – visual data (video frames)

• CMBlockBuffer — arbitrary data (sound, subtitles, timecodes)

Use & Abuse of CMSampleBuffers

• AVCaptureVideoDataOutput / AVCaptureAudioDataOutput provide CMSampleBuffers in their sample delegate callbacks

• AVAssetReader provides CMSampleBuffers read from disk

• AVAssetWriter accepts CMSampleBuffers to write to disk
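For the capture case, the buffers arrive in a delegate method. A minimal sketch of what that callback looks like, assuming self has been set as the sample buffer delegate of an AVCaptureVideoDataOutput:

- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    // timing for this frame
    CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    // the pixels themselves
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    // mess with the frame here: filter it, hand it to an AVAssetWriter, etc.
}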

Demo

How the Heck Does that Work?

• Movies have tracks, tracks have media, media have sample data

• All contents of a QuickTime file are defined in the QuickTime File Format documentation

Subtitle Sample Data

Subtitle sample data consists of a 16-bit word that specifies the length (number of bytes) of the subtitle text, followed by the subtitle text and then by optional sample extensions. The subtitle text is Unicode text, encoded either as UTF-8 text or UTF-16 text beginning with a UTF-16 BYTE ORDER MARK ('\uFEFF') in big or little endian order. There is no null termination for the text.

Following the subtitle text, there may be one or more atoms containing additional information for selecting and drawing the subtitle.

Table 4-12 lists the currently defined subtitle sample extensions.

Table 4-12: Subtitle sample extensions

'frcd': The presence of this atom indicates that the sample contains a forced subtitle. This extension has no data. Forced subtitles are shown automatically when appropriate without any interaction from the user. If any sample contains a forced subtitle, the Some Samples Are Forced (0x40000000) flag must also be set in the display flags. Consider an example where the primary language of the content is English, but the user has chosen to listen to a French dub of the audio. If a scene in the video displays something in English that is important to the plot or the content (such as a newspaper headline), a forced subtitle displays the content translated into French. In this case, the subtitle is linked ("forced") to the French language sound track. If this atom is not present, the subtitle is typically simply a translation of the audio content, which a user can choose to display or hide.

'styl': Style information for the subtitle. This atom allows you to override the default style in the sample description or to define more than one style within a sample. See "Subtitle Style Atom".

'tbox': Override of the default text box for this sample. Used only if the 0x20000000 display flag is set in the sample description and, in that case, only the top is considered. Even so, all fields should be set as though they are considered. See "Text Box atom".

'twrp': Text wrap. Set the one-byte payload to 0x00 for no wrapping or 0x01 for automatic soft wrapping.



I Iz In Ur Subtitle Track…

NSError *error = nil;
// create a reader for the asset and attach an output for its subtitle track
AVAssetReader *subtitleReader = [AVAssetReader assetReaderWithAsset:asset error:&error];
AVAssetTrack *subtitleTrack = [[asset tracksWithMediaType:AVMediaTypeSubtitle] firstObject];
AVAssetReaderTrackOutput *subtitleTrackOutput =
    [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:subtitleTrack
                                               outputSettings:nil];
[subtitleReader addOutput:subtitleTrackOutput];
[subtitleReader startReading];

// ...
BOOL reading = YES;
while (reading) {
    CMSampleBufferRef sampleBuffer = [subtitleTrackOutput copyNextSampleBuffer];
    if (sampleBuffer == NULL) {
        AVAssetReaderStatus status = subtitleReader.status;
        if ((status == AVAssetReaderStatusCompleted) ||
            (status == AVAssetReaderStatusFailed) ||
            (status == AVAssetReaderStatusCancelled)) {
            reading = NO;
            NSLog(@"ending with reader status %ld", (long)status);
        }
    } else {
        CMTime presentationTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
        CMTime duration = CMSampleBufferGetDuration(sampleBuffer);

…Readin Ur CMBlockBuffers

CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
size_t dataSize = CMBlockBufferGetDataLength(blockBuffer);
if (dataSize > 0) {
    UInt8 *data = malloc(dataSize);
    OSStatus cmErr = CMBlockBufferCopyDataBytes(blockBuffer, 0, dataSize, data);
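Once the bytes are copied out, they can be parsed by hand per the format quoted above: a big-endian 16-bit length word followed by the text. A minimal sketch, assuming the UTF-8 case for simplicity:

    if (cmErr == kCMBlockBufferNoErr && dataSize >= sizeof(uint16_t)) {
        // first two bytes: big-endian length of the subtitle text
        uint16_t textLength = CFSwapInt16BigToHost(*(uint16_t *)data);
        if (textLength > 0 && (sizeof(uint16_t) + textLength) <= dataSize) {
            NSString *subtitleText =
                [[NSString alloc] initWithBytes:(data + sizeof(uint16_t))
                                         length:textLength
                                       encoding:NSUTF8StringEncoding];
            NSLog(@"subtitle at %f for %f sec: %@",
                  CMTimeGetSeconds(presentationTime),
                  CMTimeGetSeconds(duration),
                  subtitleText);
        }
        free(data);
    }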

Subtitle Summary

• AVAssetReaderOutput provides CMSampleBuffers

• Get timing info with CMSampleBufferGetPresentationTimeStamp() and CMSampleBufferGetDuration()

• Get raw data with CMBlockBufferGet…() functions

• Have at it

Writing Samples

Demo

Screen Recording

• Run an NSTimer to get screenshots

• Many ways to do this, such as rendering your CALayer into a CGContext and making it a UIImage (see the sketch after this list)

• Convert image data into a CVPixelBuffer

• Use AVAssetWriterInputPixelBufferAdaptor to write the pixel buffer and its presentation time to an AVAssetWriterInput

Make CVPixelBuffer

CVPixelBufferRef pixelBuffer = NULL;
CFDataRef imageData = CGDataProviderCopyData(CGImageGetDataProvider(image));
cvErr = CVPixelBufferCreateWithBytes(kCFAllocatorDefault,
                                     FRAME_WIDTH,
                                     FRAME_HEIGHT,
                                     kCVPixelFormatType_32BGRA,
                                     (void *)CFDataGetBytePtr(imageData),
                                     CGImageGetBytesPerRow(image),
                                     NULL, NULL, NULL,
                                     &pixelBuffer);

Compute Presentation Time

CFAbsoluteTime thisFrameWallClockTime = CFAbsoluteTimeGetCurrent();
CFTimeInterval elapsedTime = thisFrameWallClockTime - self.firstFrameWallClockTime;
CMTime presentationTime = CMTimeMake(elapsedTime * TIME_SCALE, TIME_SCALE);

Append to AVAssetWriterInput

BOOL appended = [self.assetWriterPixelBufferAdaptor appendPixelBuffer:pixelBuffer
                                                 withPresentationTime:presentationTime];
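For completeness, a minimal sketch of the writer setup these snippets assume; the output URL, FRAME_WIDTH/FRAME_HEIGHT, and property names here are placeholders, not the talk's actual code:

NSError *error = nil;
self.assetWriter = [AVAssetWriter assetWriterWithURL:outputURL
                                            fileType:AVFileTypeQuickTimeMovie
                                               error:&error];
NSDictionary *outputSettings = @{ AVVideoCodecKey  : AVVideoCodecH264,
                                  AVVideoWidthKey  : @(FRAME_WIDTH),
                                  AVVideoHeightKey : @(FRAME_HEIGHT) };
self.assetWriterInput =
    [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
                                       outputSettings:outputSettings];
self.assetWriterInput.expectsMediaDataInRealTime = YES;
self.assetWriterPixelBufferAdaptor =
    [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:self.assetWriterInput
                                                                     sourcePixelBufferAttributes:nil];
[self.assetWriter addInput:self.assetWriterInput];
[self.assetWriter startWriting];
[self.assetWriter startSessionAtSourceTime:kCMTimeZero];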

Audio

http://indignation.deviantart.com/art/Hatsune-Miku-headphones-254724145


Core Audio ↔ Core Media

• Reading: CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(), CMSampleBufferGetAudioStreamPacketDescriptions()

• Writing: CMSampleBufferSetDataBufferFromAudioBufferList(), CMAudioSampleBufferCreateWithPacketDescriptions()
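As a hedged example of the reading side, this is roughly how a capture or reader callback could pull an AudioBufferList out of a CMSampleBuffer (a sketch; it assumes interleaved audio that fits in a single AudioBuffer):

AudioBufferList audioBufferList;
CMBlockBufferRef blockBuffer = NULL;
OSStatus err = CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
    sampleBuffer,
    NULL,                     // size needed (not used here)
    &audioBufferList,
    sizeof(audioBufferList),
    NULL, NULL,               // default allocators for the block buffer
    kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
    &blockBuffer);
if (err == noErr) {
    for (UInt32 i = 0; i < audioBufferList.mNumberBuffers; i++) {
        AudioBuffer buffer = audioBufferList.mBuffers[i];
        // buffer.mData / buffer.mDataByteSize are raw samples: run them
        // through an AUGraph, hand them to an AVAssetWriter, etc.
    }
    CFRelease(blockBuffer); // the call retained the block buffer for us
}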


Potential Uses

• Run captured / read-in audio through effects in an AUGraph

• See “AVCaptureAudioDataOutput To AudioUnit” examples (iOS & OS X) from WWDC 2012

• May make more sense for audio-oriented apps to do capture / file reads entirely from Core Audio

Video

http://www.crunchyroll.com/angel-beats


Core Media ↔ Core Video

• CMSampleBuffers provide CVImageBuffers

• Two sub-types: CVPixelBufferRef, CVOpenGLESTextureRef

• Pixel buffers allow us to work with bitmaps, via CVPixelBufferGetBaseAddress()

• Note: Must wrap calls with CVPixelBufferLockBaseAddress(), CVPixelBufferUnlockBaseAddress()
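A minimal sketch of that lock/inspect/unlock dance, assuming a BGRA pixel buffer pulled out of a capture callback's sample buffer:

CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CVPixelBufferLockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
uint8_t *baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer);
size_t width  = CVPixelBufferGetWidth(pixelBuffer);
size_t height = CVPixelBufferGetHeight(pixelBuffer);
// baseAddress now points at height * bytesPerRow bytes of BGRA pixels
CVPixelBufferUnlockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);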


Use & Abuse of CVImageBuffers

• Can be used to create Core Image CIImages

• iOS: +[CIImage imageWithCVPixelBuffer:]

• OS X: +[CIImage imageWithCVImageBuffer:]

• CIImages can be used to do lots of stuff…

Demo

Recipe

• Create CIContext from EAGLContext

• Create CIFilter

• During capture callback

• Convert pixel buffer to CIImage

• Run through filter

• Draw to CIContext

Set Up GLKView

if (self.glkView.context.API != kEAGLRenderingAPIOpenGLES2) {
    EAGLContext *eagl2Context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
    self.glkView.context = eagl2Context;
}
self.glContext = self.glkView.context;
// we'll do the updating, thanks
self.glkView.enableSetNeedsDisplay = NO;

Make CIContext

// make CIContext from GL context, clearing out default color space
self.ciContext = [CIContext contextWithEAGLContext:self.glContext
                                            options:@{ kCIContextWorkingColorSpace : [NSNull null] }];
[self.glkView bindDrawable];
// from Core Image Fun House:
_glkDrawBounds = CGRectZero;
_glkDrawBounds.size.width = self.glkView.drawableWidth;
_glkDrawBounds.size.height = self.glkView.drawableHeight;

See also iOS Core Image Fun House from WWDC 2013

Ask for RGB in Callbacks

self.videoDataOutput = [[AVCaptureVideoDataOutput alloc] init];
[self.videoDataOutput setSampleBufferDelegate:self queue:self.videoDataOutputQueue];
[self.captureSession addOutput:self.videoDataOutput];
NSDictionary *videoSettings = @{ (id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA) };
[self.videoDataOutput setVideoSettings:videoSettings];

Note: 32BGRA and two flavors of 4:2:0 YCbCr are the only valid pixel formats for video capture on iOS

Create CIFilter

self.pixellateFilter = [CIFilter filterWithName:@"CIPixellate"];
[self.pixellateFilter setValue:[CIVector vectorWithX:100.0 Y:100.0]
                        forKey:@"inputCenter"];
[self setPixellateFilterScale:self.pixellationScaleSlider.value];

Set CIFilter Params

- (IBAction)handleScaleSliderValueChanged:(UISlider *)sender {
    [self setPixellateFilterScale:sender.value];
}

- (void)setPixellateFilterScale:(CGFloat)scale {
    [self.pixellateFilter setValue:[NSNumber numberWithFloat:scale]
                            forKey:@"inputScale"];
}

Callback: Apply Filter

CVImageBufferRef cvBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CVPixelBufferLockBaseAddress(cvBuffer, 0);
CIImage *bufferCIImage = [CIImage imageWithCVPixelBuffer:cvBuffer];

[self.pixellateFilter setValue:bufferCIImage forKey:kCIInputImageKey];
bufferCIImage = [self.pixellateFilter valueForKey:kCIOutputImageKey];

Callback: Draw to GLKView

[self.glkView bindDrawable];
if (self.glContext != [EAGLContext currentContext]) {
    [EAGLContext setCurrentContext:self.glContext];
}
// drawing code here is from WWDC 2013 iOS Core Image Fun House
// set the blend mode to "source over" so that CI will use that
glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
CGRect drawRect = bufferCIImage.extent;
[self.ciContext drawImage:bufferCIImage
                   inRect:self.glkDrawBounds
                 fromRect:drawRect];
[self.glkView display];
CVPixelBufferUnlockBaseAddress(cvBuffer, 0);

Recap

[Pipeline diagram] CVPixelBuffer from camera → CIImage (via +[CIImage imageWithCVPixelBuffer:]) → CIFilter → CIImage → CIContext, drawn with -[CIContext drawImage:inRect:fromRect:] (OpenGL drawing)

Recap

[Pipeline diagram] Same pipeline, with an optional branch: render the filtered CIImage back into a CVPixelBuffer and hand it to an AVAssetWriter output, in addition to the OpenGL drawing

By the way…

There are lots of CIFilters

CICategoryBlur
CIBoxBlur CIDiscBlur CIGaussianBlur CIMedianFilter CIMotionBlur CINoiseReduction CIZoomBlur

CICategoryColorAdjustment
CIColorClamp CIColorControls CIColorMatrix CIColorPolynomial CIExposureAdjust CIGammaAdjust CIHueAdjust CILinearToSRGBToneCurve CISRGBToneCurveToLinear CITemperatureAndTint CIToneCurve CIVibrance CIWhitePointAdjust

CICategoryColorEffect
CIColorCrossPolynomial CIColorCube CIColorCubeWithColorSpace CIColorInvert CIColorMap CIColorMonochrome CIColorPosterize CIFalseColor CIMaskToAlpha CIMaximumComponent CIMinimumComponent CIPhotoEffectChrome CIPhotoEffectFade CIPhotoEffectInstant CIPhotoEffectMono CIPhotoEffectNoir CIPhotoEffectProcess CIPhotoEffectTonal CIPhotoEffectTransfer CISepiaTone CIVignette CIVignetteEffect

CICategoryCompositeOperation
CIAdditionCompositing CIColorBlendMode CIColorBurnBlendMode CIColorDodgeBlendMode CIDarkenBlendMode CIDifferenceBlendMode CIExclusionBlendMode CIHardLightBlendMode CIHueBlendMode CILightenBlendMode CILuminosityBlendMode CIMaximumCompositing CIMinimumCompositing CIMultiplyBlendMode CIMultiplyCompositing CIOverlayBlendMode CISaturationBlendMode CIScreenBlendMode CISoftLightBlendMode CISourceAtopCompositing CISourceInCompositing CISourceOutCompositing CISourceOverCompositing

CICategoryDistortionEffect
CIBumpDistortion CIBumpDistortionLinear CICircleSplashDistortion CICircularWrap CIDroste CIDisplacementDistortion CIGlassDistortion CIGlassLozenge CIHoleDistortion CILightTunnel CIPinchDistortion CIStretchCrop CITorusLensDistortion CITwirlDistortion CIVortexDistortion

CICategoryGenerator
CICheckerboardGenerator CIConstantColorGenerator CILenticularHaloGenerator CIQRCodeGenerator CIRandomGenerator CIStarShineGenerator CIStripesGenerator CISunbeamsGenerator

CICategoryGeometryAdjustment
CIAffineTransform CICrop CILanczosScaleTransform CIPerspectiveTransform CIPerspectiveTransformWithExtent CIStraightenFilter

CICategoryGradient
CIGaussianGradient CILinearGradient CIRadialGradient CISmoothLinearGradient

CICategoryHalftoneEffect …

Demo

CIColorCube

Maps colors from one RGB “cube” to another

http://en.wikipedia.org/wiki/RGB_color_space

Using CIColorCube

CIColorCube maps green(-ish) colors to 0.0 alpha, all other colors pass through

CISourceOverCompositing

Takes an inputImage and an inputBackgroundImage, and composites them into an outputImage

CIColorCube Data

const unsigned int size = 64;
size_t cubeDataSize = size * size * size * sizeof(float) * 4;
float *keyCubeData = (float *)malloc(cubeDataSize);
float rgb[3], hsv[3], *keyC = keyCubeData;

// Populate cube with a simple gradient going from 0 to 1
for (int z = 0; z < size; z++) {
    rgb[2] = ((double)z) / (size - 1); // Blue value
    for (int y = 0; y < size; y++) {
        rgb[1] = ((double)y) / (size - 1); // Green value
        for (int x = 0; x < size; x++) {
            rgb[0] = ((double)x) / (size - 1); // Red value

            // Convert RGB to HSV
            // You can find publicly available rgbToHSV functions on the Internet
            RGBtoHSV(rgb[0], rgb[1], rgb[2], &hsv[0], &hsv[1], &hsv[2]);
            // RGBtoHSV uses 0 to 360 for hue, while UIColor (used above) uses 0 to 1.
            hsv[0] /= 360.0;

            // Use the hue value to determine which to make transparent
            // The minimum and maximum hue angle depends on
            // the color you want to remove
            bool keyed = (hsv[0] > minHueAngle && hsv[0] < maxHueAngle) &&
                         (hsv[1] > minSaturation && hsv[1] < maxSaturation) &&
                         (hsv[2] > minBrightness && hsv[2] < maxBrightness);
            float alpha = keyed ? 0.0f : 1.0f;

            // re-calculate the keyC pointer: 4 floats (RGBA) per cube entry
            keyC = keyCubeData + (((z * size * size) + (y * size) + x) * 4);

            // Calculate premultiplied alpha values for the cube
            keyC[0] = rgb[0] * alpha;
            keyC[1] = rgb[1] * alpha;
            keyC[2] = rgb[2] * alpha;
            keyC[3] = alpha;
        }
    }
}

See “Chroma Key Filter Recipe” in Core Image Programming Guide

Create CIColorCube from mapping data

// Create memory with the cube data
NSData *data = [NSData dataWithBytesNoCopy:keyCubeData
                                    length:cubeDataSize
                              freeWhenDone:YES];
self.colorCubeFilter = [CIFilter filterWithName:@"CIColorCube"];
[self.colorCubeFilter setValue:[NSNumber numberWithInt:size]
                        forKey:@"inputCubeDimension"];
// Set data for cube
[self.colorCubeFilter setValue:data forKey:@"inputCubeData"];

Create CISourceOverCompositing

// source over filter
self.sourceOverFilter = [CIFilter filterWithName:@"CISourceOverCompositing"];
CIImage *backgroundCIImage = [CIImage imageWithCGImage:self.backgroundImage.CGImage];
[self.sourceOverFilter setValue:backgroundCIImage
                     forKeyPath:@"inputBackgroundImage"];

Apply Filters in Delegate Callback

CIImage *bufferCIImage = [CIImage imageWithCVPixelBuffer:cvBuffer];
[self.colorCubeFilter setValue:bufferCIImage forKey:kCIInputImageKey];
CIImage *keyedCameraImage = [self.colorCubeFilter valueForKey:kCIOutputImageKey];
[self.sourceOverFilter setValue:keyedCameraImage forKeyPath:kCIInputImageKey];
CIImage *compositedImage = [self.sourceOverFilter valueForKeyPath:kCIOutputImageKey];

Then draw compositedImage to CIContext as before

More Fun with Filters

• Alpha Matte: Use CIColorCube to map green to white (or transparent), everything else to black

• Can then use this with other filters to do edge work on the “foreground” object

• Be sure that any filters you use are of category CICategoryVideo.

More Fun With CIContexts

• Can write the effected pixels to a movie file with an AVAssetWriterInput

• Use the rendered pixels from the CIContext to create a new CVPixelBuffer; use this plus timing information to create a CMSampleBuffer

• AVAssetWriterInputPixelBufferAdaptor makes this slightly easier
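A hedged sketch of that write path, assuming a writer input and adaptor configured as in the screen-recording section (the property names here are placeholders) and the compositedImage and presentationTime from the capture callback:

CVPixelBufferRef outputBuffer = NULL;
// the adaptor's pixel buffer pool is only available once the writer has started
CVReturn cvErr = CVPixelBufferPoolCreatePixelBuffer(kCFAllocatorDefault,
                                                    self.pixelBufferAdaptor.pixelBufferPool,
                                                    &outputBuffer);
if (cvErr == kCVReturnSuccess && self.writerInput.isReadyForMoreMediaData) {
    // render the filtered image into the pixel buffer, then append it
    [self.ciContext render:compositedImage toCVPixelBuffer:outputBuffer];
    [self.pixelBufferAdaptor appendPixelBuffer:outputBuffer
                          withPresentationTime:presentationTime];
}
if (outputBuffer) {
    CVPixelBufferRelease(outputBuffer);
}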

Recap

• Most good tricks start with CMSampleBuffers

• Audio: convert to Core Audio types

• Video: convert to CIImage

• Other: get CMBlockBuffer and parse by hand

Further Info

http://devforums.apple.com/

Q&A

Slides at http://www.slideshare.net/invalidname/

See comments there for link to source code

invalidname [at] gmail.com @invalidname (Twitter, app.net) http://www.subfurther.com/blog