Email:      Pass:    Pass?
Close Window   
Free Ebook
Get Your Free Excel
Our Top 15 Excel Tutorials
Instant Access!

Subscribe for Free Excel tips & more!

Free Excel Forum

How Do I Calculate Outliers In Excel?

Forum Register
Search Excel Forum Posts, Tutorials, Macros, Tips, and More

Is there a way for excel to automatically determine outliers in a given data
set and exclude them from future calculations?

View Answers     

Similar Excel Tutorials

Stop Formula Calculation in Excel - Increase Worksheet Performance
I'll show you how to stop formulas and functions from performing their calculations and updating in Excel. This ma ...
Calculate the Difference Between Time Greater than 24 Hours in Excel
Ill show you how to calculate the difference between two times in Excel when that difference will add up to more t ...
Years, Months, Days Between two dates in Excel
I have found a rather convoluted method to Calculate the Number of Years, Months and Days between to Dates in Excel ...
FV Function - Get the Future Value in Excel
The Future Value function (FV) in Excel will return the future value of an investment based on a particular intere ...

Helpful Excel Macros

Calculate the Future Value (FV) of Compound Interest in Excel - UDF Macro
- This UDF (user defined function) calculates the Future Value of Compound Interest in Excel. The mathematical formula th
Filter Data to Show the Bottom X Percent of the Data Set in Excel - AutoFilter
- This free Excel macro filters data to display the bottom X percent of the data set in Excel. This is a great macro to u
Complete Guide to Printing in Excel Macros - PrintOut Method in Excel
- This free Excel macro illustrates all of the possible parameters and arguments that you can include in the PrintOut Meth
Filter Data in Excel to Show Only the Top X Percent of that Data Set - AutoFilter
- This Excel macro filters a set of data in Excel and displays only the top X percent of that data. This means the highes
Automatically Run a Macro at a Certain Time - i.e. Run a Macro at 4:30PM every day
- Automatically run an Excel macro at a certain time. This allows you to not have to worry about running a macro every da

Similar Topics


i have a scatter plot for which i want to insert a best fit line. however, there are some outliers within the data. i have manually identified these outliers and coloured them differently from the non-outliers, but what i would like to do now is to insert a best-fit straight line for the remaining good data, while still keeping the outliers on the scatter plot. is there a way to do that?

i found this post:
but do not understand how to "fit the trend line based on the new set of points, but graph the old set".

thanks for any help.

I've been racking my brain over this all weekend. Can you help me determine the best minimum time excluding outliers?

For example:

Dave 15
Jim 10
Mike 12
John 5
Jill 12
Kim 13
Joey 21
Ralph 6
Sue 20

I would consider 5, 6, 20, and 21 outliers. I need to remove them to show the minimum (or best time) which in this case would be 10.



does anyone know if I can get outliers from a histogram, or do I need to do a
scatter plot? I am using excel (most recent version) for mac.

I have lots of data, mostly triplicate results, with some outliers. I would
appreciate help on applying Thompson tau or other outlier removal technique
in Excel.

Hi!! I've been trying with no luck to come up with a way to:

a) find all the outliers in a range of a 1000 rows. The criteria is simply +/- average*2. The outlier can either be above or below.
b) colour the background of these outliers.
c) repeat a&b in the next column.

So in essence, I'm trying to find all values in a table that are twice (positive or negative) that of the average and highlight them.

A, B
4, 1
4, 1
4, 1
4, 4
4, 1
4, 1
4, 1
12, 1

In this case the outlier for A is 12 and the outlier for B is 4.

Any help, be it in a formula for a given cell, or VBA, is deeply appreciated.


Using the averaging approach and descriptive statistics can anyone tell me how do i find outliers? i know how to do it using the 5-number summary but cant figure out how to do it this way


hi all,

i have a range of numbers in C5:AL5
Not all of the cells are filled in, and some that are filled in are blank.

In AM5 I calculate the average of the data
and in AN5 the standard deviation where the value in the cells in the range are >0

Is it possible to calculate a standard deviation in another cell which excludes those values in the range which are 3 standard deviations away from the value in cell AN5.
Therefore, the standard deviation with the outliers.

If there is another way of doing this, i would be grateful for your help.


When evaluating a set of items for their popularity, I pull popularity metrics from two different sources, aggregating each score into an average. While I am satisfied with this metric (Let's call it the 'popularity score'), I'd like to create a simplified, more meaningful score.

My initial though was to create an index using the following formula:



Where D3 is the individual popularity score and D represents all popularity scores.

I discovered, however, that outliers would have a disruptive effect on the data. For instance, if there where several VERY popular items, other items with acceptable popularity scores would appear very low in relative terms.

Any ideas on how to account for the Outliers while still retaining the meaning in an easy to understand way?

Here is part of the data:

Here is my data (D3:D29):


A colleague suggest that I take the lognormal distribution and then take the antilog. Any idea how to implement, or if this is a sound approach?

Hello All,

I'm new to the forum and have not needed to do anything like this before but assume that excel can do this kind of thing.

I have tried to search for a solution everywhere.

OK, I have a large data set of triplicates in column F (example below):

F 1
INT#6 SYBR Unknown
25.944914 2 122 INT#6 SYBR Unknown 25.577063 3 123 INT#6 SYBR Unknown 24.799803 4 124 INT#6 SYBR Unknown 27.107956 5 125 INT#6 SYBR Unknown 26.962341 6 126 INT#6 SYBR Unknown 27.189684 7 127 INT#6 SYBR Unknown 26.185757 8 128 INT#6 SYBR Unknown 26.0001 9 129 INT#6 SYBR Unknown 26.207401 10 130 INT#6 SYBR Unknown 25.44769 11 131 INT#6 SYBR Unknown 25.850075 12 132 INT#6 SYBR Unknown 26.147488 13 133 INT#6 SYBR Unknown 24.232754 14 134 INT#6 SYBR Unknown 24.747488 15 135 INT#6 SYBR Unknown 24.813463 16 136 INT#6 SYBR Unknown 25.699005 17 137 INT#6 SYBR Unknown 25.42176 18 138 INT#6 SYBR Unknown 25.657625 19 139 INT#6 SYBR Unknown 25.596703 20 140 INT#6 SYBR Unknown 24.756275 21 141 INT#6 SYBR Unknown 24.952007 22 142 INT#6 SYBR Unknown 25.594326 23 143 INT#6 SYBR Unknown 26.192163 24 144 INT#6 SYBR Unknown 24.788391
so one triplicate is 121 to 123 (data in F1:F3) then the next set is 124 to 126 (data in F4:F6).
What I want to do is compare each number within the triplicate to the others within the same triplicate and determine which are +/- 0.5 from the others and label them as outliers.
I have come up with a poor solution that works for most instances but not when two values are 0.5 from each other, neither is marked as an outlier. In this case I would like to the formula to choose the value furthest from the median as the outlier.
I would also ideally like to mark the outliers by somehow highlighting in red.

my poor solution occupies the next 3 columns G,H and I and is very clunky:

G H I 1 =IF(OR($F2($F3+0.5)),1,0) =IF(OR($F2($F4+0.5)),1,0) =SUM(I2:J2) 2 =IF(OR($F3($F2+0.5)),1,0) =IF(OR($F3($F4+0.5)),1,0) =SUM(I3:J3) 3 =IF(OR($F4($F2+0.5)),1,0) =IF(OR($F4($F3+0.5)),1,0) =SUM(I4:J4) 4 =IF(OR($F5($F6+0.5)),1,0) =IF(OR($F5($F7+0.5)),1,0) =SUM(I5:J5) 5 =IF(OR($F6($F5+0.5)),1,0) =IF(OR($F6($F7+0.5)),1,0) =SUM(I6:J6) 6 =IF(OR($F7($F5+0.5)),1,0) =IF(OR($F7($F6+0.5)),1,0) =SUM(I7:J7)
The if the result in column I is 2 then the corresponding value in the data set is an outlier.

Can anyone help?
I hope my explanation of the problem is understandable.

This is really bugging me. If I can get the formula to work then It will save me days of trawling through data picking outliers by eye.

I eagerly await a solution and will try things of my own in the mean time but as you can see I am far from an excel master.


Hi everyone...

here is the situation...i need a formula for median to exclude the min and the max value ..that are above 0!!!

i tried this formula ...(ARRAY!) but i get Num error! ..any ideas?

suppose H28:H32 is the data range!


Is there a good way of excluding an outlier in an average calculation. In the example below will I exclude 1000 from the average-calculation.

The way to decide excluding-values can either be a percent based on the range or everything that is a higher than a user defined value. It can also be more than one outlier.
A user defined function is OK with me, if it is impossible to use the built-in functions. Any suggestion?




I am analyzing sales figures and need some assistance. I have a variety of products and a variety of territories where these products are sold. Additionally I have Gross Profit Margin figures for each product sold and many of the same products have differing GP % due to flexibility of sales agents setting the sale price.

Ideally I would love to find out what products are the most profitable, along with which areas these products sell for the most effectively.

However, after developing a Pivot Table with products in rows and Territories in Columns, I realized some products have "outlier like" GP%. These products have zero revenue as they were given to the customer as a freebie in a large contract which throws off my analysis on how profitable the product category is.

Would you recommend that I remove outliers and solely calculate based on 'normally' priced products? And second, I am also suppose to analyze large order (contain many products), how do you suggest I do that (especially with consideration to question 1)??


Can anyone help with a box and whisker related graph?
I can do the 50% quartile method of sorting this out but for a piece of research I need to study the middle 75% and exclude the top and bottom 12.5% of outliers only.
I tried the percentile approach to calculate the top and bottom 12.5% but this of course calculates figures that don't match up to actual data points, unlike the quartile method and median which actually do match up to the data points. As i need my box graph to match up to all the data points in my presentation the percentile approach doesn't work for me.
Does anyone have any idea how this can be done?

I'm attempting to do my first forecast, but I seem to be failing miserably. When I use the FORECAST function all of the forecasted totals are much too high. I

Is there a way to exclude outliers or use a moving average..... or am I just using the function completely wrong?

Any pointers would be greatly appreciated, when it's all said and done I would love to take the previous 6 months of processed data and use that to forecast the processed amounts for the next 6 months... but as it stands now these numbers won't be nearly accurate enough.

Hello all,

I'm trying to get a standard deviation of data range, excluding the outliers. I'm excluding

I have a set of data containing outliers based on a median and mean. The data is all in one column and the median and mean are each calculated in one cell via the average and mean formulas. in another column, if a value is an outlier i have it marked as "high" "low" or "" depending on whether its high, low, or not an outlier. What i would like to do is have the mean and median functions which currently look like =AVERAGE(A4:A32) exclude data that has "high" or "low" in the same row B column.


I can't see the forest through the trees on this one. My spreadsheet is designed to calculate waiting periods between each step of a multi-step process, i.e. Person A makes an appointment with GP, date of appointment, waiting time? Referral to specialist, date of appointment, waiting time? Diagnosis, date of procedure, waiting time. I am sure you get the drift.

My waiting time calculations utilise the NETWORKDAYS function - 1 and eliminate outliers created by empty cells with =IF(AND(F16

You guys are geniuses so hopefully someone has a good idea of how I can save a lot of time.

I have one worksheet with sales history per product by day. (dates are collumns, products are rows, qtys in the middle, all are values, not formulas). (I'll be using the Excel document as a workbook for future data, too, so whatever I create needs to work no matter what I paste into this raw data sheet.)

Then, I created a worksheet that pulls over the data from the first sheet, but also adds a row of trendline... Averages the 14 previous days' qtys. (so the trendline starts on day 15.)

I would like these numbers to paste back onto the first sheet, overwriting the existing data.

- Every time i do this, there may be up to 500 products, so cutting & pasting manually isn't an option.

- Some of the data points are 0 in the raw data sheet. (and some outliers are way high) Ideally, these would not be calculated in the average/ trendline.

- I could calculate the std dev of the data for each product & for hte average formula put in an IF stmt like if the value is more than 2 stdev of mean not to calculate it? Not sure how to write that formula.

- I could create a macro that will take the average data & paste it over the raw data. (My concern here is that the products' data don't all start on the same day. Also don't knwo how the macro would know to put it in the right day.)

Help is greatly appreciated! Have a good night!

Hi there,
I have a worksheet containing data in columns. For each column, I would like to calculate the standard deviation of the mid-95% of the values i.e. 2.5th percentile to 97.5th percentile (to reduce the effect of "outliers").

So first of all I calculate the values of the 2.5th and 97.5th percentiles:

A102 (2.5th percentile) = PERCENTILE(A1:A100,0.025)
A103 (97.5th percentile) = PERCENTILE(A1:A100,0.975)

Then using an array formula shamelessly plagiarized from this very website, I calculate the standard deviation of the values that lie in that range:


Does excel have a calculation for Winsorising or Winsorization

It's a statistical calculation to round up or down outliers vs trimming or removing them from the data pool.


I'm having a problem with my boxplots and they don't seem to show the correct position of the box itself in regard to outliers and Q1 & Q3 etc. Also I would like to narrow them down a bit if possible.

If anyone would also like to critique my other charts ie have I done them properly I would be very grateful to know.


A lot of times I will have a table of financial ratios wherein there is a median value calculated as the last column. We will sometimes exclude values in the table if they are outliers. Our convention is to bold those items and then exclude that cell from the formula.

What I would like to be able to do is select the median formulas and then run a macro that would take that selection, look at the formulas in the selection and then select all of the cells that the formulas take into consideration when calculating the median values.

I would then be able to very quickly see that all of the formulas were updated properly. You can go down the list of median values manually, but it is cumbersome with a long list.

The code I have now is below. What I am struggling with is a way to determine the range the median formulas reference and then how to select that total range.

Any help with this would be appreciated. Maybe there is an easier way to accomplish the same thing.

Sub ShowRanges()
Dim rngCell As Range
Dim rngFormulaCellsOnly As Range,
Dim rngResults As Range

' Select only cells with formulas.
Set rngFormulaCellsOnly = Selection.SpecialCells(xlCellTypeFormulas)

' Loop through all the currently selected cells that contain formulas.
For Each rngCell In rngFormulaCellsOnly

' Look at formual in current cell and add to rngResults
' This is the part I am struggling with.

Next rngCell

' Select the cells

End Sub


I have a dataset which contains a few outliers. Therefore, charts are distorted. It would help a lot if I coulde break the graph in two to create a graph of which the y-axis runs from e.g. 0 to 200, then breaks, and continues from 600 up... is this possible?

Thanks in advance,


I'm trying to generate a number of "average" columns for a group of stocks (based on data series for each stock), but I want to have the ability to change which stocks are in the average by simply changing a "Y" to "N". You'll see in columns C through J of the attached spreadsheet, I would like have an average of the individual stock data, but only for:
A) Those stocks I select (cells C5-C9); and
B) Starting as of the dates found with each stock (e.g. cell T2)

It would also be great if, when computing averages, the calculation would exclude significant outliers such as data that is more than 2 standard deviations away from the average.

Any help you can provide is greatly appreciated.

Second Gen v1.6.xlsx

I have a workbook with multiple sheets. I have one sheet that contains a list of data to exclude. Another tab contains a a data dump and calculations. I want this spreadsheet to search for info in the "exclude data" tab and if it finds a match in the data dump tab i want it to exclude this data.

However, when I try to use this formala "=IF(($A1422='Exclude Data'!$B$2:$B$76)*($D1422='Exclude Data'!$F$2:$F$76)*(G1422='Exclude Data'!$E$2:$E$76),"",Data!G1422)"

It works fine unless there are similar account numbers in 'Exclude Data' Column A. For example, account 1 may appear multiple times and the first in the sequence is identified however, the second and there after are ignored.

How can I get around this???