    Analyst Interview Questions


    3 significant areas to cover. Assumption is that questions in these areas will provide data to assess leadership, culture fit, & communication skills.

    First: Business Process Assessment. The candidate should be able to assess problems/opportunities in a "case" study method.

    Second: Technical Depth. The candidate will need to retrieve, manipulate, & evaluate large sets of data efficiently.

    Third: Self-directed/Leadership. Will this candidate look for business opportunities with passion?

    1. Business Process Assessment Quick Questions

    What are you reading currently?

    What has influenced your business behavior most heavily in the last year?

    What kind of process or project mgmt training have you had?

    What do you think of (five forces, rational, UML, competitive advantage, Ted Levitt's criticism of the "product lifecycle", Six Sigma)

    How do you get to root cause for an issue such as

    Looking for: Evidence that the person is growing, stretching in a direction that is successful at Amazon. Is the candidate reading The Learning Organization, The Innovator's Dilemma or Who Moved My Cheese? Do they read the Economist, MIT Tech Review, HBR or Newsweek? Do they recognize process terms?

    Longer Questions: These should be of 2 types - simple "give me an equation for situation" and then more vague case types.

    Simple Equations

    Profitability = Revenue - Costs
    Need Inventory = OH - Demand (bonus if time phased & includes forecasts, intransits)
    Predicted OH Inventory = OH + Intransits + POs - Demand - Forecast
    Healthy Inventory =

    Start just asking for a simple definition, then start discussing factors with the candidate.
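The simple equations above can be worked through with numbers during the discussion. A minimal sketch in Python, assuming hypothetical function names and made-up figures (none of them come from the original notes):

```python
# Hypothetical helpers mirroring the simple equations; all figures are invented.

def profitability(revenue, costs):
    # Profitability = Revenue - Costs
    return revenue - costs

def need_inventory(on_hand, demand):
    # Need Inventory = OH - Demand; a negative result means demand exceeds stock
    return on_hand - demand

def predicted_on_hand(on_hand, in_transit, purchase_orders, demand, forecast):
    # Predicted OH Inventory = OH + Intransits + POs - Demand - Forecast
    return on_hand + in_transit + purchase_orders - demand - forecast

print(profitability(1200.0, 900.0))
print(need_inventory(50, 80))
print(predicted_on_hand(50, 20, 40, 80, 10))
```

A natural follow-up is to ask the candidate which inputs are hardest to measure, for example what belongs in costs.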

    For example in profitability, what factors go into costs?

    Looking for thoughtfulness & testing of assumptions. How does the candidate think through the question - systematically or ad-hoc?

    Skip suggested to preface questions of this type with "there is no right answer, I want to use this as an example to see how you approach a problem"

    2. Technical Assessment Questions

    Quick questions:

    From a list of orders over the last week using the tool of your choice
    1. rank the orders by quantity
    2. avg quantity for each vendor
    3. # of distinct vendors per week.
    4. Find & count lines in a log file that have a specific ASIN or user id

    (In an onsite, could have a data file on a laptop and say "show me".)

    Looking for:
    Unix - cut cat find grep sort
    Excel/Access: basic functions, pivot tables, data structures, domain tables
    SQL - nested queries, functions, basic joins
    Perl - any scripting?, RegExp
    Reference - does the candidate know how to find help, admit boundaries

    Longer SQL/Unix Questions

    1. I need to provide a report with:
    -the total units & the average cost of book orders by day of week over the last x weeks by country

    A good answer will look something like:

    select sum(quantity), avg(cost), product, day_of_week, country
    from (
        select quantity, (quantity * cost) as cost, to_char(order_day,'dy') as day_of_week, product, country
        from order_items
        where product = 'books' and order_day between x and now
    )
    group by product, day_of_week, country

    The candidate should recognize:
    -cost must be calculated on an item basis before averaging
    -nested or inline query
    -sql functions exist for total, average, & date manipulation

    For extra credit add a join - such as book name
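The report query above can be tried against a tiny data set. The following is a sketch only: it runs on SQLite through Python's sqlite3 module rather than Oracle, so `strftime('%w', ...)` stands in for `to_char`, and the table layout, dates, and values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items "
    "(order_day TEXT, product TEXT, country TEXT, quantity INTEGER, cost REAL)"
)
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?, ?, ?)", [
    ("2024-01-01", "books", "US", 2, 10.0),  # a Monday
    ("2024-01-01", "books", "US", 1, 30.0),  # a Monday
    ("2024-01-02", "books", "DE", 3, 5.0),   # a Tuesday
    ("2024-01-02", "music", "US", 4, 8.0),   # not a book order, filtered out
])

# The inner query computes the cost per order line; the outer query aggregates.
report = conn.execute("""
    SELECT day_of_week, country, SUM(quantity) AS total_units, AVG(line_cost) AS avg_cost
    FROM (
        SELECT strftime('%w', order_day) AS day_of_week, country,
               quantity, quantity * cost AS line_cost
        FROM order_items
        WHERE product = 'books'
          AND order_day BETWEEN '2024-01-01' AND '2024-01-07'
    )
    GROUP BY day_of_week, country
    ORDER BY day_of_week, country
""").fetchall()

for row in report:
    print(row)
```

Averaging `cost` directly would ignore quantity, which is why the per-line cost is computed first.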

    2. "Say there's a text file of the form "userid-tab-command" that tracks all the commands that a given user runs. How would you find out how many times user "Bob" has run any command at all?"

    A Good Answer: At its most basic:

    'grep -c Bob filename' or 'cat filename | grep -c Bob'.

    If they understand that "Bob" could be part of the command, then the correct grep is

    grep -c '^Bob'

    to anchor the user. Even better, in case there's a user called "Bob" and another called "BobH", they should do:

    grep -c '^Bob<tab>'.
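The difference between the three patterns is easy to see on a few sample lines. A minimal sketch using Python's `re` module in place of grep; the log lines and the helper name `count_matching` are invented for illustration:

```python
import re

# Hypothetical log in "userid<tab>command" form.
log_lines = [
    "Bob\tls -l",
    "BobH\tcat notes.txt",
    "Alice\tmail Bob",
    "Bob\tgrep foo bar",
]

def count_matching(pattern, lines):
    # Counts lines containing a match, like `grep -c pattern`.
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))

naive = count_matching(r"Bob", log_lines)      # also counts BobH and Alice's command
anchored = count_matching(r"^Bob", log_lines)  # still counts user BobH
exact = count_matching(r"^Bob\t", log_lines)   # only user Bob

print(naive, anchored, exact)
```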

    3. Self-directed/Leadership

    Assessing these behaviors may occur throughout the questions of other areas.

    -Look for an opportunity to challenge the candidate on something that is obviously right or true.
    -Do they hold their ground?
    -Do they get angry or look to understand why they're being challenged?
    -What does the resume say about the candidate?
    -did they found anything, start anything, volunteer on something huge,

    Is this person an Autodidact?

    Hiring Analysts

    Companies that hire lots of analysts have the process down to a science, just as Amazon does for SDEs. The interview process at a big 5 consulting firm is very defined, behaviorally focused, looking at capabilities. An Analyst is a unique creature but not impossible to find & assess.

    During the interview loop, in addition to culture fit & interpersonal skills, a candidate should be reviewed on how they've displayed analyst type competencies in the past AND solve a problem to display the competency in actuality. Analysts are usually good presenters; just asking them about the past may not display the limits of their abilities.

    Here are a couple frameworks of core Analyst Competencies:

    1. Think broad and deep: can take the big picture strategic business view and can also dive into the details to understand a problem
    2. Problem solving skills: can they structure and frame a problem, make estimates when necessary, figure out the dataset needed (smallest, easiest dataset to draw solid conclusions), get and analyze the data, summarize the conclusions and their reasoning
    3. Communication skills: clear, organized, concise, ability to adapt to audience (VP to SDE), think on the fly, thoughtful
    4. Multi-tasking: can they juggle many issues at one time?
    5. Independence: ability to work with minimal direction and ask for help when needed
    6. Customer focus
    7. Cultural fit: Team (CO, CO, ...) and Amazon
    8. Leadership


    Find, Frame, Analyze, & Deliver within Amazon

    Find Problems/Opportunities

    An analyst should be able to recognize broken processes, bad processes, troubleshoot processes. But also prioritize whether the proposal is polishing a pig or creating a golden cow. Building pretty toys with no ROI is a waste of time. Given the business maturity at Amazon, there are a lot of process improvements or new businesses where money can be

    Past Example: A candidate should be able to point to past projects where they:
    -Worked as support
    -Saved x (Excel, PSoftware)

    Have them explain their role, drive into specifics.

    Problem Solving: Provide a problem for them to solve - tweak it for ecommerce
    -Why do split shipments matter?
    -How would you build a forecasting model for new products with no history?
    -What data does Amazon have that is unique, how can this be used in Supply Chain?
    -How many customers does a 2% damage rate to the top best selling items at the top 4 FCs impact?

    This competency is a display of analytic skills - does the candidate set assumptions, challenge the definitions, and display the ability to draft a reasonable model? Could they build a metrics package?

    Potential Skills
    -Modeling: Can a candidate draw out a forecast equation, linear programming
    -Advanced Business Measures: Time Value of Money



    Once a candidate has built a model, no-one is going to go get data for you. The tools on hand will be limited or perhaps not available. To succeed the analyst will need to identify and evaluate a data source, then get the data themselves or negotiate for SDE time. Since SDE time is money, this is usually the less preferred choice. The key elements here are abilities to:
    -Retrieve Data
    -Evaluate Data Quality
    -Data Scale

    So an analyst has found a good opportunity, determined how to quantify it, but how will the control be built ongoing?

    Past Example: A candidate should be able to explain past projects where they:
    -built a tool or heavily configured software
    -what were the shortcomings? how did they drive through their weaknesses
    -What data gathering tools were used, how big was the data set

    Have them explain their role, drive into specifics.

    Problem Solving: Provide a problem for them to solve - tweak it for ecommerce
    -If SQL is a listed skill ask for a query that tests aggregation, functions, joins, & business definitions, i.e. Write a query from an order items table that results in average # of orders, average cost of orders by product line over the last 5 weeks

    A great candidate should question the assumptions - why 5 wks, why average, why aggregated at all. Follow up with "What decisions could I make with that data?"

    -SDE design questions are good here too

    This competency is a display of technical skills & business skills - Could the candidate analyze a data set with 2 million rows? What conclusions do they draw from the results?

    Potential Skills
    -SQL
    -Design


    Once the analysis is completed, is it just a report on a shelf? What changed? Were cost reductions actually realized? What form did the analysis results take - powerpoint, 3 ring binder, email, whitepaper? Who saw them and what did they do? Is the candidate aware of good visualization guidelines (Tufte, W. Cleveland) or do they LOVE powerpoint? At Amazon, Analysts often present their own results - will the work stand up to scrutiny?

    Past Example: A candidate should be able to explain past projects where they:
    -presented results in detail, in 15 minutes
    -How did you get your points across in your allotted 10 minutes of executive time?
    -What data presentation tools were used?

    Problem Solving: Provide a problem for them to solve - tweak it for ecommerce
    -"You have 5 minutes tomorrow afternoon to report back to a VP about a question he asked you today regarding specific metric accuracy - Could you prepare an outline of your answer, what format would it be in, how would you followup on your

    Look for creativity

    Potential Skills
    -Creativity
    -Effective communication
    -a get it done attitude

    Data Engineer Interview Questions



    1 Sample Interview Questions for Data Engineering Candidates
    o 1.1 DW Concepts
    o 1.2 Tuning
    o 1.3 SQL
    o 1.4 Oracle
    o 1.5 ETL
    o 1.6 Linux/Unix
    o 1.7 Teradata
    o 1.8 Data Modeling
    o 1.9 Additional Questions for DEIII (Level 6 Bar)
    1.9.1 Oracle
    1.9.2 Architecture and design
    o 1.10 Reporting Specific Interview Questions

    What are the advantages of star schema design?

    1. Allows business entities to map directly with schema design for highly optimized performance when querying.
    2. It is widely supported by a number of BI tools.
    3. It is the simplest data warehouse schema.

    Can you provide the different types of slow changing dimensions (Type I, II, III). What are the

  • 7/23/2019 Analyst Interview Questions - AMAZON

    3. Type III SCD's are dimensions where a limited amount of history is preserved by using separate columns. 'Original' or 'Previous' columns for another column are common ways to track a limited number of changes.

    What are the difficulties in implementing a Type II dimension table?

    o ANS: When new records are created to represent changes in a dimension table, the relationships between the fact tables and common

    4. Orders table - Order_Id, Order_Date, Store_Id, SalesRep_Id, TotalAmount, Total Quantity
    5. Order Items table - Order_Id, Item_Id, Quantity, Amount

    1. How would you approach building the DW schema for the above model?
    2. ANS: Star schema or Snowflake

    1. Gathering data from different sources for transforming at different times
    2. OLTP's can quic

    o ANS: The level of granularity in a fact table refers to the detail and precision at which a fact is captured within a given context.

    Tuning

    If you have a poorly performing report/etl process, how would you investigate and tune it, going all the way back to table design?

    explain plans - when tuning, what do you look for in an explain plan that screams red flags?

    'what if you didn't have indexes'

    What about partitioning...

    What about the oracle level join types (hash, nested loop) and when each should be used

    Different types of joins and when each should be used

    SQL

    DE bar is questions 1-3

    1. Given an orders (order_id, order_day) table.. count(*) of orders last week [...] OVER (ORDER BY ORDER_DAY) AS RUNNING_TOTAL

    o FROM ORDERS

    o What are the differences between aggregates and analytic functions.. and how does oracle handle them differently?

    o ANS: Aggregate functions return one result per group of the result set, whereas analytic functions return multiple results per group, i.e. using analytic functions we may display group results along with individual rows.

    Given an orders table with order_id, customer_id and order_date with the sample data

    o Order_id, Customer_id, order_date
    o O1, C1, 01-Jan-2000
    o O2, C2, 01-Jan-2002
    o O3, C3, 01-Apr-2002
    o O4, C4, 01-Apr-2003
    o O5, C4, 01-Jan-2006
    o O6, C1, 01-May-2006

    Give SQL for the list of customer_ids who placed more than 1 order

    o SELECT Customer, COUNT(OrderID) FROM Orders
    o GROUP BY Customer
    o HAVING COUNT(OrderID) > 1

    Give the SQL for the list of customer_ids who have placed at least 1 order in 2000 and at least 1 order in 2006.

    o SELECT Customer, COUNT(OrderID) FROM Orders
    o GROUP BY Customer
    o HAVING SUM(CASE WHEN TO_CHAR(order_date,'YYYY') = '2000' THEN 1 ELSE 0 END) >= 1
    o AND SUM(CASE WHEN TO_CHAR(order_date,'YYYY') = '2006' THEN 1 ELSE 0 END) >= 1

    Please write a sql which can generate the number of Orders for each year, 2000 to 2006.

    o SELECT
    o SUM(CASE WHEN TO_CHAR(order_date,'YYYY') = '2000' THEN 1 ELSE 0 END) AS "2000",
    o SUM(CASE WHEN TO_CHAR(order_date,'YYYY') = '2001' THEN 1 ELSE 0 END) AS "2001",
    o SUM(CASE WHEN TO_CHAR(order_date,'YYYY') = '2002' THEN 1 ELSE 0 END) AS "2002",
    o SUM(CASE WHEN TO_CHAR(order_date,'YYYY') = '2003' THEN 1 ELSE 0 END) AS "2003",
    o SUM(CASE WHEN TO_CHAR(order_date,'YYYY') = '2004' THEN 1 ELSE 0 END) AS "2004",
    o SUM(CASE WHEN TO_CHAR(order_date,'YYYY') = '2005' THEN 1 ELSE 0 END) AS "2005",
    o SUM(CASE WHEN TO_CHAR(order_date,'YYYY') = '2006' THEN 1 ELSE 0 END) AS "2006"
    o FROM ORDERS
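The three order questions above can be checked against the sample rows. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module (so `strftime('%Y', ...)` replaces Oracle's `TO_CHAR`), lower-case column names, ISO dates, and years for O2 and O3 that are a best reading of the damaged sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, customer_id TEXT, order_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    ("O1", "C1", "2000-01-01"), ("O2", "C2", "2002-01-01"),
    ("O3", "C3", "2002-04-01"), ("O4", "C4", "2003-04-01"),
    ("O5", "C4", "2006-01-01"), ("O6", "C1", "2006-05-01"),
])

# Customers with more than one order.
repeat_customers = [r[0] for r in conn.execute("""
    SELECT customer_id FROM orders
    GROUP BY customer_id HAVING COUNT(order_id) > 1
    ORDER BY customer_id
""")]

# Customers with at least one order in 2000 AND at least one in 2006.
both_years = [r[0] for r in conn.execute("""
    SELECT customer_id FROM orders
    GROUP BY customer_id
    HAVING SUM(CASE WHEN strftime('%Y', order_date) = '2000' THEN 1 ELSE 0 END) >= 1
       AND SUM(CASE WHEN strftime('%Y', order_date) = '2006' THEN 1 ELSE 0 END) >= 1
""")]

# One column per year, 2000 to 2006 (a pivot built with conditional sums).
columns = ", ".join(
    f"SUM(CASE WHEN strftime('%Y', order_date) = '{y}' THEN 1 ELSE 0 END) AS y{y}"
    for y in range(2000, 2007)
)
per_year = conn.execute(f"SELECT {columns} FROM orders").fetchone()

print(repeat_customers, both_years, per_year)
```

The year filter has to sit inside the aggregate: a plain `TO_CHAR(order_date, ...)` test in HAVING refers to a column that is neither grouped nor aggregated.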

    Display the employee records who joined the department before their manager?

    o SELECT emp1.*
    o FROM EMPLOYEES emp1, EMPLOYEES emp2
    o WHERE emp1.MANAGER_ID = emp2.EMPLOYEE_ID
    o AND emp1.EMPLOYEE_JOIN_DATE < emp2.EMPLOYEE_JOIN_DATE

    Display employee records getting more salary than the average salary in their department?

    o SELECT DEPT, EMPLOYEE, SALARY
    o FROM EMPLOYEES e
    o WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEES d WHERE d.DEPT = e.DEPT)

    Display the highest paid employee in each department.

    o SELECT DEPT, EMPLOYEE, SALARY
    o FROM EMPLOYEES e
    o WHERE SALARY = (SELECT MAX(SALARY) FROM EMPLOYEES d WHERE d.DEPT = e.DEPT)

    Display the 2nd highest paid employee in each department.

    o SELECT DEPT, EMPLOYEE
    o FROM
    o (SELECT DEPT, EMPLOYEE, RANK() OVER (PARTITION BY DEPT ORDER BY SALARY DESC) AS RNK FROM EMPLOYEES)
    o WHERE RNK = 2
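The salary questions above come down to comparing each row with an aggregate over its own department. A minimal sketch on SQLite via Python's sqlite3 module, assuming an invented `employees` table and a SQLite build with window-function support (3.25 or later):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (dept TEXT, employee TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    ("ENG", "Ann", 120), ("ENG", "Bo", 100), ("ENG", "Cy", 80),
    ("OPS", "Di", 90), ("OPS", "Ed", 70),
])

# Correlated subquery: salary above the average of the employee's own department.
above_average = conn.execute("""
    SELECT e.dept, e.employee FROM employees e
    WHERE e.salary > (SELECT AVG(d.salary) FROM employees d WHERE d.dept = e.dept)
    ORDER BY e.dept, e.employee
""").fetchall()

# Analytic function: rank within each department, then keep rank 2.
second_highest = conn.execute("""
    SELECT dept, employee FROM (
        SELECT dept, employee,
               RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk
        FROM employees
    )
    WHERE rnk = 2
    ORDER BY dept
""").fetchall()

print(above_average)
print(second_highest)
```

With ties at the top, `RANK()` skips rank 2, so `DENSE_RANK()` may be the better choice depending on what "2nd highest" should mean.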

    Select student_id, student_name from students where student_id = 1 and student_id = 2. What does the query return?

    o ANS: It returns nothing: the two conditions are evaluated against the same row, and a single row's student_id cannot equal both 1 and 2 at once.

    What is the use of DESC in SQL?

    o ANS: DESC can be used to describe a table's structure, or to arrange records in descending order.

    How do you find the number of rows in a Table?

    o ANS: SELECT COUNT(*) FROM TABLE_NAME

    What is Cartesian product in the SQL?

    o ANS: A Cartesian product returns all the rows in all the tables listed in the query. Each row in the one table is paired with all the rows in each of the rest of the tables. This happens when there is no relationship defined between tables.

    What is a view? What is a materialized view? What is the difference between view and materialized view?

    o ANS: Views are virtual tables based on a query, which can combine data from multiple tables. Because a view stores no data, it reflects the underlying tables whenever it is queried. Materialized views store the query result physically and have to be refreshed (on demand, on a schedule, or on commit) to contain updated data.

    Can you insert data into a view?

    o ANS: Yes.

    What is a merge statement? What is the requirement for a merge statement? Is PK necessary for merge?

    o ANS: The MERGE statement is used to select rows from one or more sources for update or insertion into a table or view. You can specify conditions to determine whether to update or insert into the target table or view. You must have the INSERT and UPDATE object privileges on the target table and the SELECT object privilege on the source table. To specify the DELETE clause of the merge_update_clause, you must also have the DELETE object privilege on the target table. Another requirement is you cannot update the same row of the target table multiple times in the same MERGE statement, so for this to ta

    o ANS: CHAR is a fixed length data type. VARCHAR is a variable length data type and can free up unused space if possible.

    What is the NVL statement? How is it different from decode? Is it possible to implement NVL with Decode?

    o ANS: The NVL statement says if FIELD_NAME is NULL, assign value X: NVL(FIELD_NAME, REPLACEMENT_VALUE). It is different from DECODE in that DECODE has an if-then-else structure. Yes, NVL can be implemented by DECODE using: DECODE(FIELD_NAME, NULL, REPLACEMENT_VALUE, FIELD_NAME)

    Difference between CASE and DECODE?

    o ANS: DECODE can only work with scalar values. CASE can wor [...] AL partitions, which moves part of functionality solved currently by ETL pre-wrappers to default processing of RDBMS defined in Data dictionary metadata (automatic partition creation).

    What is meant by analyzing tables?

    o ANS: Analyzing a table involves collecting and interpreting statistics on a table such as the following:

    Collect or delete statistics about an index or index partition, table or table partition, index-organized table, cluster, or scalar object attribute.

    Validate the structure of an index or index partition, table or table partition, index-organized table, cluster, or object reference (REF).

    Identify migrated and chained rows of a table or cluster.

    What is oracle hint? Is the hint a command or does Oracle use it optionally?

    o ANS: A hint is a code snippet that is embedded into a SQL statement to suggest to Oracle how the statement should be executed.

    Note: Hints should only be used as a last resort if statistics were gathered and the query is still following a suboptimal execution plan.

    What is an Explain Plan?

    o ANS: An Explain Plan is an ordered set of steps used to access or modify information, based on a query, while estimating the time and cost of processing.

    Difference between hash and nested loop joins?

    o ANS: Hash joins are used for joining large data sets. The optimizer uses the smaller of two tables or data sources to build a hash table on the join



    o The difference is the performance with which these joins are conducted. Hash joins are optimal when joining large subsets of data together, whereas nested loops are more efficient for smaller datasets that preferably have an index to use. For the DW, hash joins are generally recommended as most tables are not small enough to utilize the nested loop join efficiently.

    ETL

    1. Add in world wide reporting. How would that affect your ETL?

    o ANS: Your ETL will then have to be adjusted to ensure that the data is available for reporting, based on the different time zones.

    2. Given a billion row table, How do you add a new column and bac

    3. dedupe R

    o ANS: sort mergedfile | uniq

    4. pipes

    o ANS: Pipes are a function of text filtering in Linux that can be used to construct a pipeline of commands where the output from one command is piped or redirected to be used as input to the next. Using pipelines in this way is not restricted to text streams, although that is often where they are used.

    5. remove a

    o ANS: grep is used to search for patterns in a file, whereas find is used to search for files or directories.

    What is redirection?

    o ANS: Redirection is when you change the standard input and outputs of a command to a user-specified location, typically a file. Pipes are a related mechanism that connects one command's output to the next command's input.

    What is piping?

    o ANS: Piping is when you are redirecting standard inputs and outputs of a command by using pipes.

    Find a pattern in a file

    o ANS: grep is used to search for patterns in a file.

    Count the number of lines in a file with a pattern given

    o ANS: grep -c "pattern" file.txt

    Given that 3rd column is the primary

    Data Modeling

    The following are just definitions. Try to provide a real-life problem, like how would you model so you can report on delay times between order state statuses - pending, success, error, etc.

    1. What are the primary differences between a transactional database vs a data warehouse database?

    1. Transaction Database is a Relational Database with normalized tables, whereas Data Warehouse is with denormalized tables.
    2. Transaction Database is highly volatile, designed to maintain transactions of the business, whereas Data Warehouse is non-volatile with periodic updates.
    3. Transaction Database is OLTP. Data warehouse is for analysis.
    4. Transaction Database is functional data. Data Warehouse database is subject oriented.

    2. Differentiate Primary Key and Partition Key?

    1. ANS: Primary

    2. Provide a direct and intuitive mapping between the business entities being analyzed by end users and the schema design.
    3. Provide highly optimized performance for typical star queries.
    4. Are widely supported by a large number of business intelligence tools, which may anticipate or even require that the data warehouse schema contain dimension tables.

    2. Snowfla

    3rd normal form. ER model contains normalized data whereas Dimensional model contains denormalized data.

    Describe the normal forms? What is BCNF? 2nd normal form? 3rd normal form?

    o ANS: The normal forms of relational database theory provide criteria for determining a table's degree of vulnerability to logical inconsistencies and anomalies.

    Boyce-Codd normal form (BCNF) represents a table where every nontrivial functional dependency in the table is a dependency on a super

    3. How do you handle many to many relationships in star schema.

    o ANS: One way is by using bridge tables that hold at least the foreign

    4. Advantages of using Oracle vs other database systems

    o The advantages may differ, depending on which database system is being compared to Oracle. Each database system was designed with specific advantages and disadvantages that may outweigh or downplay the advantages of Oracle (which also depends on the intended application of the database system).

    Architecture and design

    1. We get click stream data on a daily basis from source team. We need to design a data mart for storing and querying the raw data for a year history. Daily volume is around 600 million rows. What

    1. Prove or disprove the following equation:

    ( X join(f(X,Y)) Y ) left join(g(Y,Z)) Z == X join(f(X,Y)) ( Y left join(g(Y,Z)) Z )

    where all the field names of X, Y, and Z are distinct. [Answer: true. Argument via set-theoretic calculation. Incidentally, Oracle Corporation's query-plan optimizer team is in a state of denial about this equivalence.]

    1. Suppose I have two entities in my DB: Objects, and Tags. Suppose also that I have a mapping table ObjectTag which represents a many-to-many relationship between Objects and Tags. Now I wish to find, given a finite input list of Tag ids, the set of Objects which map to (a) any of the input tags [easy], and (b) all of the input tags [harder]. Can you do (a) and (b) with one query each?

    o (a)

    select distinct o.*
    from Objects o join ObjectTag ot on o.id = ot.obj_id
    where ot.tag_id in ( <input list> )

    o (b) Two ways, with the second worth many more points than the first in terms of elegance. Let n be the length of the input list:

    select distinct o.*
    from Objects o join ObjectTag ot1 on o.id = ot1.obj_id
    join ObjectTag ot2 on o.id = ot2.obj_id
    join ...
    join ObjectTag ot<n> on o.id = ot<n>.obj_id
    where ot1.tag_id = <input 1> and ... and ot<n>.tag_id = <input n>

    select o.*
    from Objects o join (
        select count( tag_id ) tag_count, obj_id
        from ObjectTag
        where tag_id in ( <input list> )
        group by obj_id
        having count( tag_id ) = <n>
    ) ot on o.id = ot.obj_id
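The "any" and "all" tag queries above can be compared side by side. A sketch on SQLite through Python's sqlite3 module, with invented table contents and lower-case names; `COUNT(DISTINCT tag_id)` is used here as a precaution in case the mapping table holds duplicate rows, which the original does not discuss.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE object_tag (obj_id INTEGER, tag_id INTEGER)")
conn.executemany("INSERT INTO objects VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
conn.executemany("INSERT INTO object_tag VALUES (?, ?)",
                 [(1, 10), (1, 20), (2, 10), (3, 30)])

input_tags = [10, 20]
marks = ", ".join("?" for _ in input_tags)  # one placeholder per input tag

# (a) objects carrying ANY of the input tags
any_ids = [r[0] for r in conn.execute(
    f"SELECT DISTINCT o.id FROM objects o "
    f"JOIN object_tag ot ON o.id = ot.obj_id "
    f"WHERE ot.tag_id IN ({marks}) ORDER BY o.id",
    input_tags,
)]

# (b) objects carrying ALL of the input tags: matched-tag count must equal n
all_ids = [r[0] for r in conn.execute(
    f"SELECT o.id FROM objects o JOIN ("
    f"  SELECT obj_id FROM object_tag WHERE tag_id IN ({marks})"
    f"  GROUP BY obj_id HAVING COUNT(DISTINCT tag_id) = ?"
    f") ot ON o.id = ot.obj_id ORDER BY o.id",
    input_tags + [len(input_tags)],
)]

print(any_ids, all_ids)
```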

    2. Suppose I have a table X with a numeric field Y. How do I write a single query with one numeric query parameter such that if the parameter is some number m, the result will only contain rows where X.Y = m, and if the parameter is null, the result will include all rows of X?

    select * from X where Y = NVL( ?, Y ) /* oracle */
    select * from X where Y = IFN
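The optional-parameter trick above works because a null parameter turns the predicate into `Y = Y`. A small sketch using SQLite via Python's sqlite3 module, where the standard `COALESCE` plays the role of Oracle's `NVL`; the table contents and the helper name `rows_for` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (y INTEGER)")
conn.executemany("INSERT INTO x VALUES (?)", [(1,), (2,), (2,), (3,)])

def rows_for(param):
    # COALESCE(?, y) yields the parameter when it is set, otherwise the row's own y.
    cursor = conn.execute(
        "SELECT y FROM x WHERE y = COALESCE(?, y) ORDER BY y", (param,)
    )
    return [row[0] for row in cursor]

print(rows_for(2))     # only the rows where y equals the parameter
print(rows_for(None))  # every row with a non-null y
```

One caveat worth raising with a candidate: rows where Y itself is NULL never satisfy `Y = Y`, so they are excluded even when the parameter is null.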

    Join: All employees from department FS
        select emp.* from Employee emp, Department dept
        where emp.deptid = dept.deptid and dept.deptname = 'FS'

    Group by: Dept Name with number of employees
        select deptname, count(empid) from Employee emp, Department dept
        where emp.deptid = dept.deptid group by deptname

    Group by (having): Dept Name with number of employees more than 10
        select deptname, count(empid) from Employee emp, Department dept
        where emp.deptid = dept.deptid group by deptname having count(empid) > 10

    Outer Join: Dept Name with number of employees, include depts with no employees also
        select deptname, count(empid) from Employee emp, Department dept
        where emp.deptid (+) = dept.deptid group by deptname

    Sub query: Highest salary employee with dept
        select emp.*, dept.* from Employee emp, Department dept
        where emp.deptid = dept.deptid and salary = (select max(salary) from Employee)

    Self join: Emp name & Mgr name
        select emp.empname, mgr.empname from Employee emp, Employee mgr
        where emp.mgrid = mgr.empid

    Self join: All employees reporting to Aneesh
        select emp.* from Employee emp, Employee mgr
        where emp.mgrid = mgr.empid and mgr.empname = 'Aneesh'

    Correlated sub query: Employees with salary more than their manager
        select emp.* from Employee emp
        where emp.salary > (select mgr.salary from Employee mgr where emp.mgrid = mgr.empid)

    Hierarchical query: All employees reporting to Ramya (directly or indirectly)
        select empname, mgrid from Employee
        start with empname = 'Ramya' connect by prior empid = mgrid

    is null: Find the top most manager
        select emp.* from Employee emp where mgrid is null
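Oracle's `start with ... connect by prior` form used above for the reporting chain has a portable counterpart in the recursive common table expression. A sketch on SQLite via Python's sqlite3 module; the employee rows are invented, and the column names `empid`, `empname`, and `mgrid` follow the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empid INTEGER, empname TEXT, mgrid INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (1, "Asha", None),   # top-most manager: mgrid is null
    (2, "Ramya", 1),
    (3, "Kiran", 2),     # reports to Ramya directly
    (4, "Lee", 3),       # reports to Ramya indirectly, through Kiran
    (5, "Omar", 1),
])

# Start from Ramya, then repeatedly add employees whose manager is already in the set.
chain = [r[0] for r in conn.execute("""
    WITH RECURSIVE reports(empid, empname) AS (
        SELECT empid, empname FROM employee WHERE empname = 'Ramya'
        UNION ALL
        SELECT e.empid, e.empname
        FROM employee e JOIN reports r ON e.mgrid = r.empid
    )
    SELECT empname FROM reports ORDER BY empname
""")]

print(chain)
```

As with `start with`, the starting row (Ramya herself) is part of the result; filter it out if only her reports are wanted.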



    The above will cover some basic scenarios. If you want multiple joining conditions, maybe add another table like address into the mix and create some joining conditions. Can ask about EXISTS, NOT EXISTS and other correlated subquery conditions.

    Ask some question regarding partitioning - say we have tables: orders, customers. Orders has order date, performance issues - how to improve. Should arrive at partitioning by date. Maybe one question about giving hints in sql query.

    A few Interview questions in sections

    Statistics

    1. What is 'Simpson's paradox'? Give an example. Followup: How might this paradox occur in continuous distributions?

    SQL

    1. Suppose you are aggregating shipping_addresses over customers; each customer has a customer_id and each address has an address_id; customers may have multiple shipping

    We want to aggregate shipping address zip codes up to customers to choose a 'representative' zip code for each customer that can be used for model building.

    There are three tables

    purchases has customer purchases including shipping_address_id(


    Give me a case where I would want to use a hash table?

    What is the time complexity of retrieving an element from hash table?

    Give me a regex to match a 10-digit phone number of the form 555-555-5555.
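One possible answer to the phone-number question, sketched with Python's `re` module. It assumes the intended form is three digits, a dash, three digits, a dash, and four digits; the dashes are a reading of the damaged original, and `is_phone` is a hypothetical helper name.

```python
import re

# Three digits, dash, three digits, dash, four digits; nothing before or after.
PHONE = re.compile(r"\d{3}-\d{3}-\d{4}")

def is_phone(text):
    # fullmatch anchors the pattern at both ends of the string.
    return PHONE.fullmatch(text) is not None

print(is_phone("555-555-5555"))
print(is_phone("5555555555"))
print(is_phone("555-555-55555"))
```

A good follow-up is whether the candidate anchors the pattern, since an unanchored search would accept the third input.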

    Write a method to print out a binary tree's nodes in level-order.
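A common approach to the level-order question is a breadth-first traversal with a queue. A minimal sketch in Python; the `Node` class and the sample tree are assumptions made for illustration.

```python
from collections import deque

class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def level_order(root):
    # Visit nodes breadth-first: a FIFO queue holds the frontier of the traversal.
    values = []
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()
        values.append(node.value)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return values

tree = Node(1, Node(2, Node(4), Node(5)), Node(3, None, Node(6)))
print(*level_order(tree))
```

Each node is enqueued and dequeued once, so the traversal is linear in the number of nodes.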


    Find Nth element from the last in a lin

    1 Competencies
    o 1.1 Data Engineering
    o 1.2 Data Modeling and Design
    o 1.3 Database Concepts
    o 1.4 Coding and Problem Solving
    o 1.5 Hiring Manager
    o 1.6 Bar Raiser
    o 1.7 DW Grid
    o 1.8 Competency Interviewer Pool

    Following are the competencies that are identified that each person should focus on for the DW Data Engineer role. Before looking into the competencies, please abide by the

    Please ma

    o Expect for partitions, exchange partitions.
    o A DE II and DE III should be aware of impact to indexes, global/local.

    Tables in three Clusters out of sync, how will you correct it?

    o Expect for more clarifying questions li

    Partitioning concepts
    Parallelism concepts
    Consistent reads (ora-snapshot too old errors)
    Indexes, MVs etc etc..
    Distributed Databases (pros and cons)

    Coding and Problem Solving

    This includes giving candidates problems and observing the approach and SQL coding skills for the same. My recommendation will be to start off with simple SQL coding skills and move to medium to complex problems that require intermediate designs and implementing the above with SQL code as well. You can also give problems that require procedural coding (PL/SQL programming). In SQL, please observe for minimal scans, effective joins, not too many subqueries, set operators, temporary tables, With tables etc..

    Examples include:

    You can start off with Top 10 salaries in an employee table
    Self join type of questions, employees manager relation in same employee table
    Joins, outer joins
    Case when statements/decode
    Analytical functions (lag lead, ran
    Data structure usage in PL/SQL programs (Cursors, tables, arrays etc.)

    Hiring Manager

    Project management
    cult fit
    - can also pick any s [...] Votes Votes 3 Votes Votes

    Total Interviewers: A, B, C, D, HM, BR
    DE - A B
    DM - C D
    Coding - A B C
    DB Concepts - D HM
    HM Round - HM BR
    BR - BR HM

    So we need 4 onsite interviewers + a HM + a BR. (HM should do one of the competencies as well).
    and PS








    %M!B& s'ill



    MohanN N A N "


    Ara&alN N N N "


    1..2 Reporting
    1..3 SQL
    1..4 Data Modelling
    1..5 Unix
    1..6 Oracle DB Technology
    1..7 Data Warehousing
    1..8 Essbase

    See Amazon_Analytics_DE_Interviews for summary of typical DE interview for BI

    Outline of Phone Screen

    1. 5 min Introduction (hello and quick "who you are", describe job position
    2. 10 min Ask about bac

    2. What is pivoting? How will you write a pivoting sql?
    3. What is a dashboard?
    4. What is scorecarding?
    5. Explain the Dimension Hierarchy
    6. OBIEE Product overview
    7. Request processing flow in OBIEE, role of each layer
    8. Different type of cache?
    9. Level based measures and Preferred drill path
    10. What is shared logon property in the Connection pool setup
    11. Connection pool optimization
    12. Difference between online and offline repository
    13. Steps for MUD development
    14. Steps for LDAP setup
    15. What is Guided navigation and how it wor

    7. given order and order items tables, select customer ids of customers who placed orders with more than 3 items (having or subquery)
    8. What is the use of DESC in SQL?
    9. How do you find the number of rows in a Table?
    10. What is Cartesian product in the SQL?
    11. What is a view? What is a materialized view? What is the difference between view and materialized view?
    12. Can you insert data into a view?
    13. What is a merge statement? What is the requirement for a merge statement? Is PK necessary for merge?
    14. What is dual? Is it a table? If so what columns does it have? What's the data type?

    o Describe different joins
    o given order and order items tables, select customer ids of customers who placed orders with more than 3 items (having or subquery)

    create buc

    4. What is a type 2 dimension? How many types are there?

    Unix

    1. What does ls do?
    2. If a file has permissions 000, then who can access the file?
    3. What is the difference between grep and find commands?
    4. What is redirection?
    5. What is piping?
    6. How would you dedupe a text file?
    7. How do you view in-use ports?

    Oracle DB Technology

    1. What is difference between UNIQUE and PRIMARY KEY constraints?
    2. Differentiate between TRUNCATE and DELETE
    3. Differentiate between IN and EXISTS? Which is faster IN or EXISTS?
    4. What is the difference between UNION and UNION ALL?
    5. Difference between CHAR and VARCHAR?
    6. What is the NVL statement? How is it different from decode? Is it possible to implement NVL with Decode?
    7. What is COALESCE function?
    8. Difference between CASE and DECODE?
    9. Is there any way we can change the column name in a table
    10. Which is faster Insert or Delete?
    11. Can a primary
    13. What does ROLLBACK do?
    14. What are partitions?

    Data Warehousing

    1. What is the data type of the surrogate
    8. What is MAXL?
    9. What is MDX?
    10. What is aggregation?
    11. Why is aggregation needed?
    12. If new data is added to the cube, without adding new dim members, is reaggregation required?
    13. What is query based aggregation and stop value based aggregation?