Linear Interpolation Using Microsoft Excel Tables 2


Linear interpolation according to Wikipedia is a method of curve fitting using linear polynomials to construct new data points within a discrete set of known points. Or more appropriately defined as getting the value in a numerical table which almost always lies somewhere between known values when you are looking for a result in a hurry.

Getting by with finding the slope and using y = mx + b was fine during university, as courses were short-term and you moved on. Out in the workforce, you may find yourself needing to interpolate to get a rating on a flange or valve and usually not just looking up one but many for all the equipment in a register which is continually changing throughout the project life-cycle.

Back when I started my first job in Mississauga Ontario, I was tasked with going through the spreadsheets and checking equations. That was the first time I came across the VLOOKUP function. I used that function over the following ten years until I stumbled upon the INDEX and MATCH beautiful combination.

I found myself continually needing to lookup values for calculations or for reporting. Once I discovered the INDEX and MATCH, I discovered more uses every day. One of the primary applications was interpolating data from a table. I worked my way around a solution and discovered the equation was enormous and truthfully daunting. I knew that while it worked, I was worried how I could recreate it for other works. Hence this post. I will step through the process of setting up an input table and getting a results table with minimal effort and an equation that can be copied right from here into a cell and voilà.

I have broken up the post in 2 section, the first part I will go through 6 steps from a blank sheet to a fully functional formatted table with linear interpolated results. At which point, feel free to knock off. In the second part, I will discuss in detail all intermediate calculations to show precisely how the equation works. You’d have to be pretty keen to read that part as it is long-winded but if you are like me, and want to know how something is working before putting it into use, then I recommend having a read through it.

Written Equation and Graph

Linear Interpolation Equation and Graph

Part 1: 6 Steps to Interpolating Like a Pro

1 – Compile your data

Compile your data into your spreadsheet.

Raw Data

Raw Data for Finding Values During Interpolation

2 – Convert your Data into a Table

a. Select your data > Data > Insert > Table (Click the checkbox the My Table has headers

Windows with arrows

Converting Data to a Table

b. Click the Table > Design > Table Name > Enter T_Inputs > Press Enter

Excel ribbon for Table Name

Renaming an Excel Table Name

c. Data Formatted as a Table

Raw data in Table Format

Raw Data Converted to an Excel Table 

3 – Set up your Results Data

Create a table with three columns and name them as shown below (can change later). Add a few rows of info in the first two columns.

Results Data Table

Set up of the Results Data

4 – Convert you Results Data into a Table

Repeat step 2 BUT name this Table T_Results

Results Data in a Table

Results Table Converted to an Excel Table

5 – Copy the Equation Into your Results Table

Copy the formula and paste it into any cell in the y column.

=IFERROR( IF( [@[x_axis]] = MAX(T_Inputs[x_axis]), INDEX(T_Inputs, MATCH([@[x_axis]], T_Inputs[x_axis], 1 ), MATCH([@[data_columns]], VALUE(T_Inputs[#Headers]), 0) ), ( INDEX(T_Inputs, MATCH([@[x_axis]], T_Inputs[x_axis],1), MATCH([@[data_columns]], VALUE(T_Inputs[#Headers]), 0 ) ) * ( IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1 ) + 1) ) – [@[x_axis]] ) + IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs, MATCH([@[x_axis]], T_Inputs[x_axis], 1 ) + 1, MATCH([@[data_columns]], VALUE(T_Inputs[#Headers]), 0) ) ) * ( [@[x_axis]] – INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis],1)) ) ) / ( IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1 ) + 1 ) ) – INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1) ) ) ), “Check”)

Note: This is an array function (discussed below), so while in the active cell press the following keys Ctrl+Shift+Enter (Windows) or control+shift+return (Mac). This will put { } brackets around the equation denoting the array. If you don’t have the { } brackets, the results will not be correct.

Results Table

Results Table showing the Interpolated Values

6 – Rename your headers

Rename the header of your tables to suit your needs. The equation is updated automatically. That is the beauty of using tables.

I used data from ASME B16.34 Valves – Flanged, Threaded, and Welding End Table 2-1.1 Ratings for Group 1.1 Materials as my example.

Tables with Renaming applied

The Input and Results Tables Renamed

Part 2: Walkthrough from Start to Final Equation

My example extends interpolating one column to multiple columns using the INDEX and MATCH combo. Hence this can be expanded further to bilinear interpolation, which I will leave for another day.

I set up the raw data as explained above. I highlighted the table to help show the parts of the table being the header, x and y-axis data.

Raw Data Table

Raw Data Highlighted Showing the Header, x & y values

The following results table shows each step of calculations, which I have combined into the final equation used above. I will discuss each calculation below.

Complete Table of Equations

Step through each Variable of the Interpolating Equation

I have highlighted a row as an example for the discussion below.

Colour coded tables

Color Coding of inputs of 900 and 100 for a result of 139.8

data_col

=MATCH([@[data_columns]], VALUE(T_Inputs[#Headers]), 0)

This finds the column number in the table based on the input in data_columns.

Table headers are in text format, hence the VALUE function is necessary to convert it from text to a numerical value that can be compared to the input in data_columns. 

=MATCH(900, VALUE({“x_axis”,”150″,”300″,”600″,”900″,”1500″,”2500″,”4500″}), 0) =
=MATCH(900,{#VALUE!,150,300,600,900,1500,2500,4500},0) = 5 (5th column)

As this function is converting text to values, it has to be entered as an array. To enter a formula as an array, you have to be in the active cell and press the following keys: Ctrl+Shift+Enter (Windows) or control+shift+return (Mac).

x0_Row

MATCH([@[x_axis]], T_Inputs[x_axis], 1)

This finds the row number in the table based on the input in the x_axis column.

=MATCH(100,{-29;38;50;100;150},1) = 4 (4th row)

x1_Row

MATCH([@[x_axis]], T_Inputs[x_axis], 1) + 1

This finds the next row number in the table based on the input in the x_axis column. This is done by adding the + 1.

x0

INDEX(T_Inputs[x_axis],MATCH([@[x_axis]],T_Inputs[x_axis],1))

We have found the row; we now use INDEX to get the actual value at that row position.

=INDEX({-29;38;50;100;150}, 4) = 100

x1_err

INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1) + 1 )

This formula finds the actual valve of x1.

=INDEX({-29;38;50;100;150}, 5) = 150

Having a look at data_columns 1500 and x-axis of 150, the x1_err is #REF!. WT# is that! Looking closely, the x1_Row is 6, and Row 6 of the input data is empty (outside the table limits). This happens at the bottom of a table. To capture this, we need to do some checks as shown next for x1.

=INDEX({-29;38;50;100;150}, 6) = #REF!

x1

IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1) +1 ))

This performs the same function as x1_err above but with a check to see if it is at the bottom row so that we can get the correct value.

We check if the x-axis input is equal to the maximum value of the x_axis in the inputs table (i.e. for an ascending table the maximum value will be at the bottom). If this is true then we want the value x_axis in the bottom row of the table.

=IF(100=MAX({-29;38;50;100;150}), MAX({-29;38;50;100;150}), INDEX({-29;38;50;100;150},5)) = 150 (900 and 100 example)

=IF(150=MAX({-29;38;50;100;150}), MAX({-29;38;50;100;150}), INDEX({-29;38;50;100;150},6)) = 150 (1500 and 150 example)

y0

INDEX(T_Inputs, [@[x0_Row]], [@[data_col]] )

This function finds the initial y value in the y-axis data for the corresponding data_columns and x_axis inputs.

=INDEX({-29,19.6,51.1,102.1,153.2,255.3,425.5,765.9;38,19.6,51.1,102.1,153.2,255.3,425.5,765.9;50,19.2,50.1,100.2,150.4,250.6,417.7,751.9;100,17.7,46.6,93.2,139.8,233,388.3,699;150,15.8,45.1,90.2,135.2,225.4,375.6,676.1},4,5) = 139.8 (y-axis data – row 4, column 5)

y1_err

INDEX(T_Inputs, [@[x1_Row]], [@[data_col]] )

This function finds the next row y value.

=INDEX(y axis data, 5, 5) = 135.2

As discussed in x1_err, when the value is equal to the bottom row then the next row is a blank row and will give the #REF!. We have to peform the same check to get around the error.

y1

IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs, [@[x1_Row]], [@[data_col]] ) )

To get around the bottom row error we check if the value is at the bottom of the table, then we take that y value, and if not we take the next row value.

y_err

=( [@y0] * ( [@x1] – [@[x_axis]] ) + [@y1] * ( [@[x_axis]] – [@x0] ) ) / ( [@x1] – [@x0] )

Now we have the 2 known data points, we can now solve the linear interpolating equation to find the value that falls in between.

Back to 1500 and 150, the x0 and x1 are the same value after we apply our error check. The difference between the two is zero, and then we divide by zero and get the dreaded #DIV/0!. Now what? Another error check and fix.

y_i

=IF( [@[x_axis]] = MAX(T_Inputs[x_axis]), [@y0], ( [@y0] * ( [@x1] – [@[x_axis]] ) + [@y1] * ( [@[x_axis]] – [@x0] ) ) / ( [@x1] – [@x0] ) )

We have to check if we are at the bottom of the table, and if we are then the value is the value on the bottom row. So we do the check as discussed above to check if we are on the bottom row and take that y value, and if not we apply the interpolating equation. #DIV/0 fixed!

y_

=IF( [@[x_axis]] = MAX(T_Inputs[x_axis]), INDEX(T_Inputs, MATCH([@[x_axis]], T_Inputs[x_axis], 1 ), MATCH([@[data_columns]], VALUE(T_Inputs[#Headers]), 0) ), ( INDEX(T_Inputs, MATCH([@[x_axis]], T_Inputs[x_axis],1), MATCH([@[data_columns]], VALUE(T_Inputs[#Headers]), 0 ) ) * ( IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1 ) + 1) ) – [@[x_axis]] ) + IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs, MATCH([@[x_axis]], T_Inputs[x_axis], 1 ) + 1, MATCH([@[data_columns]], VALUE(T_Inputs[#Headers]), 0) ) ) * ( [@[x_axis]] – INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis],1)) ) ) / ( IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1 ) + 1 ) ) – INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1) ) ) )

Now we just compile all the intermediate equations into a final equation which only uses the input values from data_columns and x-axis. That #REF! appears again on some.

That #REF! is a good thing. If we enter a value greater than those in the header or the x-axis, then we don’t want to apply an interpolation because those values would be outside the range of the input data. So this error is good and lets us know that what we are trying to find something outside the limits of the input data. But the #REF! is a bit intimidating and could be switched out to something a little more elegant.

y

=IFERROR( IF( [@[x_axis]] = MAX(T_Inputs[x_axis]), INDEX(T_Inputs, MATCH([@[x_axis]], T_Inputs[x_axis], 1 ), MATCH([@[data_columns]], VALUE(T_Inputs[#Headers]), 0) ), ( INDEX(T_Inputs, MATCH([@[x_axis]], T_Inputs[x_axis],1), MATCH([@[data_columns]], VALUE(T_Inputs[#Headers]), 0 ) ) * ( IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1 ) + 1) ) – [@[x_axis]] ) + IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs, MATCH([@[x_axis]], T_Inputs[x_axis], 1 ) + 1, MATCH([@[data_columns]], VALUE(T_Inputs[#Headers]), 0) ) ) * ( [@[x_axis]] – INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis],1)) ) ) / ( IF([@[x_axis]] = MAX(T_Inputs[x_axis]), MAX(T_Inputs[x_axis]), INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1 ) + 1 ) ) – INDEX(T_Inputs[x_axis], MATCH([@[x_axis]], T_Inputs[x_axis], 1) ) ) ), “Check”)

Let’s wrap the entire equation with an IFERROR function and change that #REF! to “Check”.

That’s it. If you made it this far, you are pretty darn keen, or just passionate like me about the how to utilise the tools at hand to improve your workflow.

If you have any questions or spot any errors or know a better way, then please feel free to drop a comment below.

Other Resources

I didn’t want to get too deep into the functions as the post had plenty to digest. If you would like to learn more about the functions then check out these great resources:


2 thoughts on “Linear Interpolation Using Microsoft Excel Tables

Comments are closed.