AWS Java SDK @ scale

Page 1: AWS Java SDK @ scale

AWS JAVA SDK @ SCALE
BASED MOSTLY ON EXPERIENCES WITH S3

image source: http://xkcd.com/

Page 2: AWS Java SDK @ scale

C R E D E N T I A L SO U R

Page 3: AWS Java SDK @ scale

ENDPOINTS

• REST API for everyone

• Great documentation

• http://aws.amazon.com/documentation/

Page 4: AWS Java SDK @ scale

AWS JAVA SDK

• One monolithic jar before 1.9.0

• Currently split into ~48 smaller modules dedicated to individual Amazon services

• All depend on aws-java-sdk-core module

• Other runtime dependencies:

• commons-logging

• Apache HttpClient (4.3.4)

• Joda-Time

Page 5: AWS Java SDK @ scale

CREDENTIALS

• Manually provide accessKey and secretKey (generated by IAM)

• Manual key management

• No automatic rotation

• Leaked keys will cost you serious $$$

new AmazonS3Client(new BasicAWSCredentials(accessKey, secretKey));

Page 6: AWS Java SDK @ scale

CREDENTIALS

“I only had S3 keys on my GitHub and they where gone within 5 minutes!

Turns out through the S3 API you can actually spin up EC2 instances, and my key had been spotted by a bot that continually searches GitHub for API keys. Amazon AWS customer support informed me this happens a lot recently, hackers have created an algorithm that searches GitHub 24 hours per day for API keys. Once it finds one it spins up max instances of EC2 servers to farm itself bitcoins.

Boom! A $2375 bill in the morning.”

http://www.devfactor.net/2014/12/30/2375-amazon-mistake/

Page 7: AWS Java SDK @ scale

CREDENTIALS

• Use a credentials provider

• Default behaviour when the zero-argument constructor is invoked

• EnvironmentVariableCredentialsProvider

• SystemPropertiesCredentialsProvider

• ProfileCredentialsProvider

• InstanceProfileCredentialsProvider

• All but the last one share the security problems of manual access/secret key management

new AmazonS3Client();

Page 8: AWS Java SDK @ scale

CREDENTIALS

• Use InstanceProfileCredentialsProvider

• The server's IAM role must be configured with the permissions needed by the service using this provider.

• Calls EC2 Instance Metadata Service to get current security credentials.

• http://169.254.169.254/latest/meta-data/iam/security-credentials/

• Automatic management and rotation of keys.

• Stored only in memory of calling process

Page 9: AWS Java SDK @ scale

CREDENTIALS

• Use InstanceProfileCredentialsProvider

• Credentials are reloaded under lock which may cause latency spikes (every hour).

• Instantiate with refreshCredentialsAsync == true

• Problems when starting on developers' machines

• Use AdRoll’s hologram to create fake environment locally

• https://github.com/AdRoll/hologram

Page 10: AWS Java SDK @ scale

BUILT-IN MONITORING

    amazonS3Client.addRequestHandler(new RequestHandler2() {
        @Override
        public void beforeRequest(Request<?> request) {
        }

        @Override
        public void afterResponse(Request<?> request, Response<?> response) {
            request.getAWSRequestMetrics()...
        }

        @Override
        public void afterError(Request<?> request, Response<?> response, Exception e) {
        }
    });

Page 11: AWS Java SDK @ scale

BUILT-IN MONITORING

    AmazonS3Client amazonS3 = new AmazonS3Client(
        new StaticCredentialsProvider(credentials),
        new ClientConfiguration(),
        new RequestMetricCollector() {
            @Override
            public void collectMetrics(Request<?> request, Response<?> response) {
            }
        }
    );

Page 12: AWS Java SDK @ scale

TESTING WITH S3

• Use buckets located close to testing site

• Use fake S3 process:

• https://github.com/jubos/fake-s3

• https://github.com/tkowalcz/fake-s3

• same thing but with a few bug fixes

• Not scalable enough

• Write your own :(

• Not that hard

    // look out for issue 414
    amazonS3.setEndpoint("http://localhost...");

Page 13: AWS Java SDK @ scale

SCARY STUFF

• #333 SDK can't list bucket nor delete S3 object with characters in range [0x00 - 0x1F]

• According to the S3 objects naming scheme, [0x00 - 0x1F] are valid characters for the S3 object. However, it's not possible to list bucket with such objects using the SDK (XML parser chokes on them) and also, they can't be deleted thru multi objects delete (also XML failure). What is interesting, download works just fine.

• #797 S3 delete_objects silently fails with object names containing characters in the 0x00-0x1F range

• Bulk delete over 1024 objects will fail with unrelated exception
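The two problems above can be guarded against on the client side before a multi-object delete is issued. The following is a minimal sketch, not part of the AWS SDK: `S3KeyBatches`, `hasControlChars` and `partition` are hypothetical names. It flags keys containing characters in the 0x00-0x1F range and splits a key list into batches of a caller-chosen size. The batch size is left as a parameter; S3 documents a limit of 1,000 keys per multi-object delete request, so a value at or below that is the assumption here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not an AWS SDK class.
final class S3KeyBatches {

    /** Returns true if the key contains any character in the range 0x00-0x1F. */
    static boolean hasControlChars(String key) {
        for (int i = 0; i < key.length(); i++) {
            if (key.charAt(i) <= 0x1F) {
                return true;
            }
        }
        return false;
    }

    /** Splits keys into consecutive batches of at most batchSize elements each. */
    static List<List<String>> partition(List<String> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, keys.size());
            // Copy the sub-list so each batch is independent of the input list.
            batches.add(new ArrayList<>(keys.subList(start, end)));
        }
        return batches;
    }
}
```

Each batch would then be passed to one delete request, and keys for which `hasControlChars` returns true could be routed to single-object deletes instead.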

Page 14: AWS Java SDK @ scale

“ASYNCHRONOUS” VERSIONS

• There is no truly asynchronous mode in AWS SDK

• Async versions of clients use synchronous blocking http calls but wrap them in a thread pool

• S3 has TransferManager (we have no experience here)
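The pattern described above, a blocking call handed to a thread pool, can be sketched with JDK classes alone. This is an illustration of the pattern and not the SDK's actual code: `BlockingCallWrapper` and `submit` are hypothetical names, and the SDK's async clients return their own future types.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Supplier;

// Hypothetical wrapper illustrating "async by thread pool"; not an AWS SDK class.
final class BlockingCallWrapper {
    private final ExecutorService executor;

    BlockingCallWrapper(ExecutorService executor) {
        this.executor = executor;
    }

    /**
     * Runs the blocking call on a pool thread and returns a future for its result.
     * The pool thread stays blocked for the whole duration of the call, so the
     * number of in-flight requests is bounded by the pool size.
     */
    <T> CompletableFuture<T> submit(Supplier<T> blockingCall) {
        return CompletableFuture.supplyAsync(blockingCall, executor);
    }
}
```

The caller's thread is freed, but the cost is only moved: every outstanding request still occupies one thread, which is why this is not truly asynchronous I/O.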

Page 15: AWS Java SDK @ scale

BASIC S3 PERFORMANCE TIPS

• Pseudo random key prefix allows splitting files among S3 “partitions” evenly

• Listing is usually the bottleneck. Cache list results.

• Or write your own microservice to eliminate lists
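One common way to obtain a pseudo-random key prefix is to derive it from a hash of the key itself, so the prefix is spread evenly yet can be recomputed from the original name. The sketch below assumes a four-hex-digit MD5 prefix and a `/` separator; both choices, and the name `KeyPrefixer`, are illustrative and do not come from the slides.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only.
final class KeyPrefixer {

    /** Returns the key prefixed with the first four hex digits of its MD5 hash and a slash. */
    static String prefixed(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            // Mask to 0..255 so each byte prints as exactly two hex digits.
            return String.format("%02x%02x/%s", digest[0] & 0xff, digest[1] & 0xff, key);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform is required to provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Because the prefix is a pure function of the key, readers can locate an object without a lookup table. The trade-off is that listing by the original, human-readable prefix no longer works, which fits the advice above to cache or avoid list operations.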

Page 16: AWS Java SDK @ scale

SDK PERFORMANCE

• Creates tons of short-lived objects

• Many locks guarding internal state

• Profile with Java Mission Control (if it does not crash)

• Or YourKit

• Then test on production data

Page 17: AWS Java SDK @ scale
Page 18: AWS Java SDK @ scale

    public XmlResponsesSaxParser() throws AmazonClientException {
        // Ensure we can load the XML Reader.
        try {
            xr = XMLReaderFactory.createXMLReader();
        } catch (SAXException e) {
            throw new AmazonClientException("Couldn't initialize a SAX driver to create an XMLReader", e);
        }
    }

Page 19: AWS Java SDK @ scale

    @Override
    protected final CloseableHttpResponse doExecute(final HttpHost target, final HttpRequest request,
                                                    final HttpContext context)
            throws IOException, ClientProtocolException {

        Args.notNull(request, "HTTP request");
        // a null target may be acceptable, this depends on the route planner
        // a null context is acceptable, default context created below

        HttpContext execContext = null;
        RequestDirector director = null;
        HttpRoutePlanner routePlanner = null;
        ConnectionBackoffStrategy connectionBackoffStrategy = null;
        BackoffManager backoffManager = null;

        // Initialize the request execution context making copies of
        // all shared objects that are potentially threading unsafe.
        synchronized (this) {

Page 20: AWS Java SDK @ scale

    public synchronized final ClientConnectionManager getConnectionManager() {
        if (connManager == null) {
            connManager = createClientConnectionManager();
        }
        return connManager;
    }

    public synchronized final HttpRequestExecutor getRequestExecutor() {
        if (requestExec == null) {
            requestExec = createRequestExecutor();
        }
        return requestExec;
    }

    public synchronized final AuthSchemeRegistry getAuthSchemes() {
        if (supportedAuthSchemes == null) {
            supportedAuthSchemes = createAuthSchemeRegistry();
        }
        return supportedAuthSchemes;
    }

    public synchronized void setAuthSchemes(final AuthSchemeRegistry registry) {
        supportedAuthSchemes = registry;
    }

    public synchronized final ConnectionBackoffStrategy getConnectionBackoffStrategy() {
        return connectionBackoffStrategy;
    }

Page 21: AWS Java SDK @ scale

OLD APACHE HTTP CLIENT (4.3.4)

• Riddled with locks

• Reusing the same client can save resources, but at the cost of performance

• different code paths may not target the same sites

• open sockets are not that costly

• better to use many client instances (e.g. per-thread)

• Make sure the number of threads using one client instance is less than the maximum number of connections in its pool

• severe contention on returning connections to the pool

• recent versions got better

Page 22: AWS Java SDK @ scale

BASIC CONFIGURATION

    <bean id="..." class="com.amazonaws.services.s3.AmazonS3Client" scope="prototype">
        <constructor-arg>
            <bean class="com.amazonaws.ClientConfiguration">
                <property name="maxConnections"
                          value="#{T(Integer).parseInt('${storage.readingThreads}') * 2}"/>
                <property name="protocol" value="HTTP"/>
            </bean>
        </constructor-arg>
    </bean>

Page 23: AWS Java SDK @ scale

CLIENT POOL

    <bean id="poolTargetSource" class="pl.codewise.voluum.util.AmazonS3ClientPool">
        <property name="targetBeanName" value="amazonS3Client"/>
        <property name="maxSize" value="10"/>
    </bean>

    <bean id="amazonS3Client" class="org.springframework.aop.framework.ProxyFactoryBean" primary="true">
        <property name="targetSource" ref="poolTargetSource"/>
        <property name="interfaces">
            <list>
                <value>com.amazonaws.services.s3.AmazonS3</value>
            </list>
        </property>
    </bean>

    int index = ThreadLocalRandom.current().nextInt(getMaxSize());
    return clients[index];
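The random-selection idea in the snippet above can be shown without Spring. The sketch below is a guess at the general shape, not the actual `AmazonS3ClientPool` class, whose source is not shown in the slides: `RandomClientPool`, its `Supplier` factory and `pick` are hypothetical names. It creates a fixed number of client instances up front and hands out a random one on each call, which spreads threads across instances and so reduces contention on any single client's locks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

// Hypothetical stand-alone pool for illustration; not the Spring-based class from the slide.
final class RandomClientPool<T> {
    private final List<T> clients;

    RandomClientPool(int maxSize, Supplier<T> factory) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        List<T> created = new ArrayList<>(maxSize);
        for (int i = 0; i < maxSize; i++) {
            created.add(factory.get()); // one independent client instance per slot
        }
        this.clients = created;
    }

    /** Returns a randomly chosen client; the list is never modified, so no locking is needed. */
    T pick() {
        int index = ThreadLocalRandom.current().nextInt(clients.size());
        return clients.get(index);
    }
}
```

With the SDK on the classpath, the factory would be something like a lambda that constructs a configured S3 client. Random selection does not guarantee an even spread at any instant, but it needs no shared counter and therefore adds no contention of its own.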

Page 24: AWS Java SDK @ scale

WHAT TO DO WITH THIS?

• Hardcore approach (classpath overrides of the following classes)

• Our own AbstractAWSSigner that uses third party, lock free HmacSHA1 signing algorithm

• ResponseMetadataCache without locks (send metadata to /dev/null)

• AmazonHttpClient to remove call to System.getProperty

• DateUtils using joda time (now fixed in SDK itself)
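The slides do not name the third-party HmacSHA1 implementation used in the signer override. As a JDK-only illustration of the same goal, the sketch below keeps one `Mac` instance per thread so that `Mac.getInstance` is looked up once per thread instead of once per signed request. That this lookup is the source of the contention is an assumption, and `PerThreadHmacSha1` is a hypothetical name.

```java
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for illustration; not the third-party library the slide refers to.
final class PerThreadHmacSha1 {
    private static final String ALGORITHM = "HmacSHA1";

    // Mac instances are not thread-safe, so each thread gets its own.
    private static final ThreadLocal<Mac> MAC = ThreadLocal.withInitial(() -> {
        try {
            return Mac.getInstance(ALGORITHM);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("HmacSHA1 not available", e);
        }
    });

    /** Computes HMAC-SHA1 of data under key using the calling thread's Mac instance. */
    static byte[] sign(byte[] key, byte[] data) {
        Mac mac = MAC.get();
        try {
            mac.init(new SecretKeySpec(key, ALGORITHM)); // re-keying also resets the Mac
        } catch (InvalidKeyException e) {
            throw new IllegalArgumentException("invalid HMAC key", e);
        }
        return mac.doFinal(data);
    }

    /** Lower-case hex encoding, used here only to make results easy to compare. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

Wiring this into the SDK would still require the classpath override described above; the sketch only covers the signing primitive.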

Page 25: AWS Java SDK @ scale

PERFORMANCE ACHIEVED

Dstat output. User mode CPU usage mostly related to data processing.

Columns shown: CPU (user, system, idle), network transfer (IN/OUT), IRQ/CNTX

Page 26: AWS Java SDK @ scale

OPTIMISATIONS RESULT

com.amazonaws.services.s3.model.AmazonS3Exception:

Please reduce your request rate.

(Service: Amazon S3; Status Code: 503; Error Code: SlowDown)

Page 27: AWS Java SDK @ scale

"The most amazing achievement of the computer software industry is its continuing cancellation of the steady and staggering gains made by the computer hardware industry."

– HENRY PETROSKI