Using the Power of Excel… - American Library...

Post on 15-Mar-2021

Using the Power of Excel…

To help with cutting a budget

About

Karen Harker

MLS, MPH

Collection Assessment

University of North Texas

About you

Individual: 66%

Group: 34%

Registrations

34%

32%

12%

10%

12%

Registrants

Academic-Public

Academic-Private

Public Library

Community/Technical

Other

Where you are

Objectives

Poll on your Excel skills

Take advantage of Excel’s features & functions

Key features & functions to cover:

• VLookup() – for organizing data

• PercentRank.inc() – for ranking items

• Conditional formatting – for visualizing data

• Pareto distributions (80/20) – for evaluating Big Deals

About the Webinar

Intermediate to advanced

Function Wizard

Pauses built-in

Criteria Used

Purpose

• Select resources to cut from budget based on evidence of value.

Objective

• Rank resources from least to most value based on criteria:

  • Price

  • Use

  • Cost per use

  • Pareto number (80/?)

  • Inflation factor

  • Subject librarians’ ratings

About the Data

Usage

• 3 year average annual usage

• Highest & best measure of usage for type of resource

Pareto Distribution

• Distribution of usage across titles in a package

• Benchmark: 80% of usage from 20% of titles

• Comparisons of the second number (80/??)

Highest & Best Uses

Individual journals

• Full-text downloads

Ejournal packages

• Full-text downloads

• Distribution of usage across titles

Audiovisual

• Items streamed/full-text downloads

Literature databases

• Abstracts/record views

Full-text databases

• Abstract/record views

• Full-text downloads

Online reference (miscellaneous)

• Abstract/record views

Data Sources

Integrated Library System (ILS)

• Sierra

• Bibliographic information

• Order record number (“o999999”)

• Key Identifier

COUNTER Reports of Usage

• JR1: Full-Text

• DB1: Abstracts

Export

• Excel

• CSV

Functions

Making Excel do the Work

Functions

What are functions?

• Mini programs that return a value

What are they made of?

• Equal sign (=)

• Tag or label

• Inputs or parameters, in parentheses and separated by commas

Example

• =Sum($E$2:E10)

• adds the numbers or the values of cell references and returns the total.

Order matters

• The order of the inputs or parameters matters.

Two Key Functions

VLOOKUP()

• Look something up

PERCENTRANK.INC()

• Distribution

What does VLOOKUP() do?

Master List

• ID

• Title

• Price

• Usage

• Inflation

• Ratings

Resource Type

• ID

• Title

• Price

• Usage

• Inflation

• Ratings

• V is for Vertical

• Looks down the first column of a list for a specific value, then…

• …returns the value of a specific column in that row.

• Allows you to link lists by an ID number

VLOOKUP() Parameters Decoded

=VLOOKUP(A2, 'Master List'!$A$1:$D$305, 2, FALSE)

Lookup_value

Table_array

Col_index_num

Range_lookup

What are you looking up?

Number or cell reference

Where are you looking it up?

The range that has the data you are needing.

File or Worksheet

Cell Range

VLOOKUP() Parameters

Lookup value

• What are you looking up?

• Number or cell reference

Lookup range

• Where are you looking it up?

• The range that has the data you are needing.

Column index #

• What do you want to return?

• Column number

• These are numbers, NOT letters.

Range lookup

• How precise do you want to be?

• True - Approximate match is OK

• False - Only exact match.
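For readers who think in code, the four parameters can be sketched in Python. This is only an illustration of the exact-match (FALSE) case, not how Excel implements it; the `vlookup` helper and the sample rows are hypothetical and are not data from the webinar.

```python
def vlookup(lookup_value, table, col_index_num):
    """Exact-match lookup, in the spirit of VLOOKUP(..., FALSE).

    Scans the first column of `table` from the top and returns the value in
    column `col_index_num` (1-based, as in Excel) of the first matching row.
    Returns the string "#N/A" when no row matches.
    """
    for row in table:
        if row[0] == lookup_value:
            return row[col_index_num - 1]
    return "#N/A"


# Hypothetical rows: ID, Title, Renewal Price.
master_list = [
    (101, "Journal of Examples", 1200.00),
    (102, "Sample Database", 5400.00),
    (103, "Demo Package", 25000.00),
]

title = vlookup(102, master_list, 2)    # column 2 = Title
price = vlookup(102, master_list, 3)    # column 3 = Renewal Price
missing = vlookup(999, master_list, 2)  # ID not in the first column
print(title, price, missing)
```

The column index is a number counted from the left edge of the range, which is why the second call asks for 3 rather than "C".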

Simple Example

• You want to look up an ID (#39) and return the name:

  • Lookup value - 39

  • Lookup range - A1:C10

  • Column index number - 3 (column C or Full Name)

  • Range lookup - False (exact matches only)

• =VLOOKUP(39,A1:C10,3,FALSE)

  • returns "Suroor Fatima"

Column C is the 3rd column

Challenges

• Challenge questions: what do these return?

  • =VLOOKUP(42,A1:C10,2,FALSE)

    • Operations

  • =VLOOKUP(35,A1:C10,3,FALSE)

    • Yossi Banai

  • =VLOOKUP(38,A1:C10,2,FALSE)&", "&VLOOKUP(38,A1:C10,3,FALSE)

    • Operations, Axel Delgado

  • =VLOOKUP(54,A1:C10,3,FALSE)

    • #N/A

IFERROR()

• =IFERROR(some function, what to return if there is an error)

• Embedded functions

  • Excel processes innermost functions and works outward

• =IFERROR(VLOOKUP(52,A1:C10,3,FALSE),"N/A")

  • If “52” can’t be found, returns “N/A”
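The same wrap-the-lookup idea can be sketched in Python. The `iferror` helper and the sample names are hypothetical. One difference from Excel is worth noting: Python evaluates arguments before the call, so the inner lookup is passed as a function to be run inside the error handler.

```python
def iferror(compute, value_if_error):
    """Run `compute()` and fall back to `value_if_error` if it fails."""
    try:
        return compute()
    except (KeyError, ValueError, ZeroDivisionError):
        return value_if_error


# Hypothetical ID -> name list (not the data shown in the webinar).
names_by_id = {201: "Staff Member A", 202: "Staff Member B"}

found = iferror(lambda: names_by_id[201], "N/A")
not_found = iferror(lambda: names_by_id[52], "N/A")  # 52 is not in the list
print(found, not_found)
```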

Applying VLOOKUP()

Setting up the files

Multiple Files or Worksheets

Master List Columns

A. Order # (ID)

B. Title

C. Renewal Price

D. Type

Resource Type Worksheets

1. Filter Master List on Type

2. Copy & Paste Order #

  • From Master List

  • To Worksheet

3. Use VLOOKUP()

  • Title

  • Renewal Price

4. Add Other Data

  • Usage

  • Ratings

Master List Worksheet

ID in first column

Resource Type

Ejournal

• Individual titles

Database

• Literature

• A&I or Full-text

Package

• Big Deals

Reference

• Online reference source

Resource Type Worksheets

1. Filter Master List on Type

2. Copy Order #

3. Paste in resource type worksheet (E-journal)

Use VLOOKUP() in Resource Type sheet to get Title from Master List

Title = B = Col. 2

Filling in the cells

• Copy & paste the formula is quick & easy - BUT…

  • Use Relative cell references for the Lookup_value - A2

  • Use Absolute cell references for the Table_array - $A$1:$D$305
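The filter, copy, and fill-down steps can be sketched in Python to show why one reference moves and the other stays put. The order numbers, titles, and prices below are hypothetical, and the column order follows the Master List Columns slide (Order #, Title, Renewal Price, Type).

```python
# Hypothetical Master List rows: Order #, Title, Renewal Price, Type.
master_list = [
    ("o1000001", "Journal A", 1500.00, "Ejournal"),
    ("o1000002", "Database B", 8000.00, "Database"),
    ("o1000003", "Journal C", 900.00, "Ejournal"),
    ("o1000004", "Package D", 42000.00, "Package"),
]

# The absolute reference: one fixed table, indexed once on its first column.
by_order = {row[0]: row for row in master_list}

# Filter the Master List on Type and copy the Order #s to the worksheet.
ejournal_ids = [row[0] for row in master_list if row[3] == "Ejournal"]

# The relative reference: the lookup value changes on every row (A2, A3, ...)
# while the table being searched never changes.
ejournal_sheet = [
    (order, by_order[order][1], by_order[order][2]) for order in ejournal_ids
]

for line in ejournal_sheet:
    print(line)
```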

Use VLOOKUP() to Get Price

Price in 4th column

Use VLOOKUP() to Get Price

Add Usage Data to Resource Type Worksheet

VLOOKUP() from Master List

VLOOKUP() to Master List

Titles & Price from Master List to Resource Type Worksheets

Master List

Ejournals

Database

Package

Usage Data & Rankings to Master List from Resource Type Worksheets

Master List

Ejournal

Database

Package

It’s all relative

Using PercentRank.inc() to Compare Resources

Comparing Resources Against Each Other

Relativity

• How a resource "stacks up" against others of its kind.

Sort by some value

• CPU (cost per use)

• Usage

• Cost

Distributions vary

• Wide

• Inconsistent

Use percentiles

• Understand the distribution

Check it out

[Figure: Histogram of Usage. Horizontal axis: Bin (10 through 1000, then "More"); vertical axis: Frequency (0 to 12).]

Title A has 45 uses

Title B has 155 uses

Where does Title A fall relative to all the other titles? Title B?

PERCENTRANK.INC()

Returns

• The rank of a value as a percentage

• 0 to 1.00 inclusive

Parameters

• Array: Column of interest

• X: The value of interest

• Significance: # of significant digits

Example

• =PercentRank.inc(E:E,E2,2)

PercentRank() of CPU

2 digits past decimal
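A minimal Python sketch of what the function returns for a value that is already in the column: the share of the other values that fall below it. The usage numbers are hypothetical (45 and 155 echo Title A and Title B). Two assumptions to flag: this sketch does not interpolate for values missing from the list, which Excel does, and it truncates rather than rounds to the requested digits, which is how Excel appears to behave.

```python
import math


def percentrank_inc(values, x, significance=3):
    """Rank of `x` within `values` as a fraction from 0 to 1 inclusive."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    if x not in values:
        raise ValueError("this sketch only ranks values present in the list")
    below = sum(1 for v in values if v < x)  # how many values are smaller
    rank = below / (n - 1)
    factor = 10 ** significance
    # Truncate to the requested digits; the tiny epsilon guards against
    # floating-point results such as 28.999999999999996.
    return math.floor(rank * factor + 1e-9) / factor


# Hypothetical 3-year average uses for seven titles.
usage = [12, 45, 60, 88, 155, 240, 610]

rank_a = percentrank_inc(usage, 45, 2)   # Title A
rank_b = percentrank_inc(usage, 155, 2)  # Title B
print(rank_a, rank_b)
```

The lowest value always ranks 0 and the highest always ranks 1, which is what "inclusive" means here.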

Compare the Ranks of Different Measures

50th percentile for usage

80th percentile for CPU

CPU: Lower is better.

Usage: Higher is better.

Directions of Comparisons

Comparisons should be in the same direction

Decide…

• High = good

• Low = bad

…Or

• Low = good

• High = bad

Reverse directions, when needed

Original Ranks

• Cost: Low is good

• Use: High is good

• CPU: Low is good

• Inflation: Low is good

• Ratings: High is good

Transformed Ranks

• Transformed Cost: High is good

• Use: High is good

• Transformed CPU: High is good

• Transformed Inflation: High is good

• Ratings: High is good

Compare the (Transformed) Ranks

1 minus % Rank for CPU

50th percentile for usage

20th percentile for CPU

Higher is Better
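The "1 minus % rank" transformation is small enough to sketch in Python. The function name and the `low_is_good` flag are hypothetical; the two sample ranks are the ones on the slide (50th percentile for usage, 80th for CPU).

```python
def transformed_rank(percent_rank, low_is_good):
    """Flip a percent rank so that higher always means better."""
    if low_is_good:
        # Rounding hides floating-point noise such as 0.19999999999999996.
        return round(1 - percent_rank, 2)
    return percent_rank


usage_rank = transformed_rank(0.50, low_is_good=False)  # already "high is good"
cpu_rank = transformed_rank(0.80, low_is_good=True)     # flipped
print(usage_rank, cpu_rank)
```

After the flip, a resource at the 80th percentile for CPU (expensive per use) sits at the 20th percentile, so it can be compared directly with its usage rank.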

Efficiency of ‘big deals’

Distribution of Usage Across Titles Within a Package

Power Law Distribution

• In statistics, a power law is a functional relationship between two quantities, where one quantity varies as a power of another.

  • Wikipedia

Pareto Distribution in Libraries

AKA The 80/20 Rule

• 80% of the usage is from 20% of the collection.

• 80% of the uses are from 20% of the users.

Efficiency of an Ejournal Package

• 80% of usage is from ??% of the titles.

• 20% is a benchmark.

• Higher is better.

1. List titles in package.

2. Gather usage data.

3. Sort by usage Z-A.

Title Name 2011 2012 2013 3yr. Avg.

Package 12, Title 45 3783 4094 4562 4146.33

Package 12, Title 57 1722 1226 1162 1370.00

Package 12, Title 29 1313 1351 1252 1305.33

Package 12, Title 53 1263 1242 1335 1280.00

Package 12, Title 43 1081 1255 1250 1195.33

Package 12, Title 50 1076 986 1364 1142.00

Package 12, Title 32 1572 918 765 1085.00

Package 12, Title 13 949 1156 1010 1038.33

Package 12, Title 20 740 921 1018 893.00

Package 12, Title 58 1002 805 789 865.33

Package 12, Title 31 970 902 680 850.67

Package 12, Title 9 568 675 1148 797.00

Package 12, Title 40 703 731 870 768.00

Package 12, Title 46 599 846 838 761.00

Package 12, Title 24 583 709 844 712.00

Package 12, Title 21 639 590 568 599.00

Package 12, Title 42 585 592 459 545.33

Package 12, Title 36 517 459 491 489.00

Package 12, Title 1 466 469 450 461.67
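The 3yr. Avg. column is a plain mean of the three annual counts. A short Python check against the first two rows of the table (the function name is hypothetical):

```python
def average_annual_usage(uses_by_year):
    """Mean of the annual use counts, rounded to two decimals."""
    return round(sum(uses_by_year) / len(uses_by_year), 2)


title_45 = average_annual_usage([3783, 4094, 4562])  # 2011, 2012, 2013
title_57 = average_annual_usage([1722, 1226, 1162])
print(title_45, title_57)
```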

Calculations for Pareto Distribution

% of Uses

• Cumulative sum ⁄ total uses

• =SUM($E$2:E2)/SUM(E:E)

• Locate the value closest to your benchmark (e.g. 80%)

% of Titles

• Cumulative count ⁄ total # titles

• =COUNT($E$2:E2)/COUNT(E:E)

• Read the value next to the benchmark % uses
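The two running columns and the "closest to the benchmark" step can be sketched in Python. The usage counts are hypothetical, chosen so the arithmetic is easy to follow, and `pareto_share` is an illustrative name rather than anything from the webinar.

```python
def pareto_share(usage, benchmark=0.80):
    """Return (% of uses, % of titles) at the row nearest the benchmark.

    Titles are sorted by usage, highest first, and both percentages are
    accumulated row by row, like the cumulative SUM and COUNT columns.
    Assumes the total usage is greater than zero.
    """
    ordered = sorted(usage, reverse=True)
    total_uses = sum(ordered)
    total_titles = len(ordered)
    best = None
    running_uses = 0
    for count, uses in enumerate(ordered, start=1):
        running_uses += uses
        pct_uses = running_uses / total_uses
        pct_titles = count / total_titles
        if best is None or abs(pct_uses - benchmark) < abs(best[0] - benchmark):
            best = (pct_uses, pct_titles)
    return best


# Hypothetical package of ten titles (1,000 uses in total).
usage = [500, 200, 100, 80, 50, 30, 20, 10, 5, 5]

pct_uses, pct_titles = pareto_share(usage)
print(pct_uses, pct_titles)
```

In this made-up package, 80% of the uses come from 30% of the titles, so it would be reported as 80/30.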

Pareto Distribution

Title Name 3yr. Avg. % Uses % Titles

Package 12, Title 45 4146.33 17.59% 1.75%

Package 12, Title 57 1370.00 23.40% 3.51%

Package 12, Title 29 1305.33 28.94% 5.26%

Package 12, Title 53 1280.00 34.37% 7.02%

Package 12, Title 43 1195.33 39.44% 8.77%

Package 12, Title 50 1142.00 44.29% 10.53%

Package 12, Title 32 1085.00 48.89% 12.28%

Package 12, Title 13 1038.33 53.29% 14.04%

Package 12, Title 20 893.00 57.08% 15.79%

Package 12, Title 58 865.33 60.75% 17.54%

Package 12, Title 31 850.67 64.36% 19.30%

Package 12, Title 9 797.00 67.74% 21.05%

Package 12, Title 40 768.00 71.00% 22.81%

Package 12, Title 46 761.00 74.23% 24.56%

Package 12, Title 24 712.00 77.25% 26.32%

Package 12, Title 21 599.00 79.79% 28.07%

Package 12, Title 42 545.33 82.11% 29.82%

Package 12, Title 36 489.00 84.18% 31.58%

Package 12, Title 1 461.67 86.14% 33.33%

Title 45 has over 17% of uses.

In this package, 20% of titles account for 2/3 of total uses.

About 80% of uses come from nearly 30% of titles.

Compare Distributions of All Packages

ORDER # Title Renewal Price # Titles Cost/ Title 3 yr Avg Uses CPU Pareto %

o1044667 Package 13 $ 1,974.97 6 $ 329.16 69 $ 28.62 50%

o4518731 Package 26 $ 3,919.83 8 $ 489.98 1305 $ 3.00 50%

o3099891 Package 268 $ 7,214.26 14 $ 515.30 89 $ 81.06 50%

o3679408 Package 87 $ 4,168.51 41 $ 101.67 1482 $ 2.81 47%

o3462341 Package 17 $ 12,305.61 39 $ 315.53 4817 $ 2.55 45%

o3874291 Package 89 $ 2,383.44 7 $ 340.49 1577 $ 1.51 43%

o1638543 Package 240 $ 22,557.40 355 $ 63.54 13756 $ 1.64 35%

o3906115 Package 25 $ 15,400.53 39 $ 394.89 509 $ 30.26 34%

o4616935 Package 262 $ 217,544.85 599 $ 363.18 63401 $ 3.43 33%

o4203276 Package 28 $ 3,794.65 22 $ 172.48 685 $ 5.54 30%

o2978969 Package 12 $ 64,795.21 59 $ 1,098.22 23585 $ 2.75 28%

o4081791 Package 227 $ 55,241.67 315 $ 175.37 26803 $ 2.06 27%

o3014782 Package 126 $ 137,240.35 5766 $ 23.80 28400 $ 4.83 24%

o2741003 Package 280 $ 288,666.48 1718 $ 168.02 25830 $ 11.18 24%

o1653441 Package 9 $ 38,135.83 12 $ 3,177.99 6870 $ 5.55 23%

o380186x Package 260 $ 12,332.98 37 $ 333.32 6032 $ 2.04 23%

o3768284 Package 239 $ 3,661.66 42 $ 87.18 47 $ 77.91 21%

o4096083 Package 295 $ 485,336.56 1571 $ 308.93 75883 $ 6.40 21%

o3798161 Package 43 $ 55,446.98 437 $ 126.88 9035 $ 6.14 19%

o2612380 Package 177 $ 39,781.00 2062 $ 19.29 230620 $ 0.17 19%

o3933416 Package 20 $ 2,189.88 52 $ 42.11 2171 $ 1.01 17%

o3006785 Package 143 $ 116,987.74 110 $ 1,063.52 4789 $ 24.43 17%

o3244064 Package 301 $ 22,390.00 37 $ 605.14 292 $ 76.68 14%

o1745232 Package 5 $ 5,529.10 1249 $ 4.43 2463 $ 2.24 2%
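The Cost/ Title and CPU columns look like simple ratios of the other columns. As a check, assuming that is how they were derived, the Package 13 row can be reproduced in Python (function names are hypothetical):

```python
def cost_per_title(renewal_price, n_titles):
    """Renewal price divided by the number of titles in the package."""
    return round(renewal_price / n_titles, 2)


def cost_per_use(renewal_price, avg_uses):
    """Renewal price divided by the 3-year average annual uses."""
    return round(renewal_price / avg_uses, 2)


# Package 13: renewal price $1,974.97, 6 titles, 69 average uses.
pkg13_cost_per_title = cost_per_title(1974.97, 6)
pkg13_cpu = cost_per_use(1974.97, 69)
print(pkg13_cost_per_title, pkg13_cpu)
```

Other rows may differ by a cent or so, because the table shows average uses rounded to whole numbers.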

Conditional formatting

Quick way to highlight outliers or visually represent distributions

Ways to Use Conditional Formatting

• Highlight based on a specific value

  • Usage Measure (e.g. Abstracts, FTDs, etc.)

  • Greater than .7, .3-.7, and lower than .3

• Visually represent distributions

  • A visualization of PercentRank.inc()

  • CPU

  • Pareto
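The three-band rule (greater than .7, .3 to .7, lower than .3) amounts to a small classification. A Python sketch follows; the band labels are hypothetical, and treating exactly .3 and .7 as the middle band is an assumption, since the slide does not say which side the boundaries fall on.

```python
def rank_band(percent_rank):
    """Sort a percent rank (0 to 1) into one of three highlight bands."""
    if percent_rank > 0.7:
        return "high"
    if percent_rank >= 0.3:
        return "middle"
    return "low"


bands = [rank_band(r) for r in (0.85, 0.50, 0.10)]
print(bands)
```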

Conditional Formatting CPU

Set the Conditional Formatting

• Highlight the CPU column

• Select Conditional Formatting->3 color scale

• Red – Yellow – Green (High – Medium – Low)

• Highest is red; Lowest is green

Highest & Lowest 10th Percentile

• Conditional Formatting->Manage Rules->Edit Rule

• Change “Highest” and “Lowest” to “Percentile”.

Conditional Formatting CPU

Changing Rule to Percentile

Change “Lowest” to “Percentile”

Change “Highest” to “Percentile”

Conditional Formatting CPU

Altogether, Now

Master List - Summary columns

• Use VLOOKUP to "grab" the summary data from your Resource Type worksheets

• 3-yr avg uses

• CPU

• Pareto Distribution (Packages only)

Use Conditional Formatting

• Highlight important text

• Visualize distributions

Compiled Master List

Imported from ILS

VLookUp() from Resource Type Worksheets

Master List

Visualizing Use Rank by 3 categories.

CPU by Percentile Rank

Caveats

Use Table Formatting

• Can name your tables

• Automatically copies & pastes formulas

• Easier to add columns

• Adjusts formulas for absolute & relative cell ranges

Don't rename your files

• References will not change

Save all of your files in one folder

• Preserves relationships

Use the same structure in all of the worksheets

• Easier to set up

What (I hope) you’ve learned

• VLookup(): for organizing data

• PercentRank.inc(): for ranking items

• Conditional formatting: for visualizing data

• Pareto distributions (80/20): for evaluating the efficiency of Big Deals

Questions and Comments

• Karen.harker@unt.edu

• Libraries are for Use

  • Librariesareforuse.wordpress.com

• UNT Faculty Profile

• Karen Harker in UNT Scholarly Works

• Charleston Pre-Conference Workshop:

  • Keeping it Real: A Comprehensive and Transparent Evaluation of Electronic Resources

  • Cost: $150

  • Presenters:

    • Karen R. Harker

    • Laurel Crawford

    • Todd Enoch