US Equity Buy/Sell Pressure and Retail Indicators Minute Bar Guide
Version 1.0.0 (Feb 2025)
We are here to help you do great things with our market and reference data. For questions, feedback, and other concerns, you may reach our team of experts using the following contact information:
algoseek customer support
support@algoseek.com
(+1) 646 583 1832
algoseek sales
sales@algoseek.com
(+1) 646 583 1832
Effective Date | Version | Revision Type | Description |
Feb 4, 2025 | 1.0.0 | Release | An initial version of the dataset |
Identifying retail trades in Trade and Quote (TAQ) pricing data, specifically using the SIP (Securities Information Processor) feed, can be challenging because the SIP feed aggregates and normalizes data from multiple exchanges and doesn’t directly label whether a trade is retail or institutional. However, some indicators can be used to identify whether a trade is retail with high probability. For example, trade size, trade price, or the exchange where the trade happened can provide some information to identify retail trades.
The US Equity Buy/Sell Pressure and Retail Indicators Minute Bar dataset is built from top-of-book intraday quotes and trades for all listed stocks, ETNs, ETFs, ADRs, and funds from 15+ US exchanges and marketplaces. It provides a list of retail indicators and other analytics that can be helpful in such an analysis. The main approach to identifying retail trades is to analyze sub-penny pricing for FINRA/TRF trade reports, as described in the paper “Tracking Retail Investor Activity”.
The algoseek Buy/Sell Pressure and Retail Indicators Minute Bar dataset is built from “as-is” tick data collected from the live SIP feed by algoseek’s co-located ticker plant servers in the Equinix NY2 and NY4 data centers, connected with 10Gb fiber for low latency.
The Securities Information Processor (SIP) includes Tape A and Tape B covered by the Consolidated Tape Association (CTA) plan and Tape C covered by the Unlisted Trading Privileges (UTP) plan. The SIP links the US markets by processing and consolidating all protected bid/ask quotes and trades from every trading venue into a single and easily consumable data feed.
The SIP disseminates and calculates critical regulatory information, including the National Best Bid and Offer (NBBO) and Limit Up Limit Down (LULD) price bands, among other important regulatory information such as short sale restrictions and regulatory halts. In the highly fragmented world of US equities, the SIP is an easy way for people to get a view of the current state of the market.
Equity trades are executed on Public Exchanges (e.g., NASDAQ, BATS, NYSE, ARCA, etc.) and off the public exchanges in Dark Pools, Broker-Dealer internal crossing, and Block Trades.
Regulation National Market System (NMS) requires all trades to be reported. There are currently three FINRA Trade Reporting Facilities (TRF). They are affiliated with registered national securities exchanges and provide FINRA members with a mechanism for reporting transactions effected otherwise than on an exchange.
Regulation NMS allows up to 10 seconds after the Trade execution time for the trade report to be sent to an exchange’s TRF for publication. The delay can result in TRF Trade reports printed on the market data feed being out of the current NBBO.
A round lot (or board lot) is a normal unit of trading of a security, which currently is 100 shares of stock in the US. Any quantity less than 100 shares is referred to as an odd lot. Odd lots are not subject to the Regulation NMS rules requiring execution to be within the current NBBO. Broker-dealers send odd lots to the exchange paying the most rebate per share and not the best execution price. Odd lot executions can create unrealistic high/low trade prices in an OHLC bar.
algoseek Buy/Sell Pressure and Retail Indicators dataset provides continuous bars from pre-market opening (4 am ET), regular market hours, and post-market until the last exchange closes, which means there will always be a bar even if there are no events during the bar period.
If there are no changes to the Bid/Ask in the NBBO during a bar period, the current NBBO Bid/Ask from the previous bar period will be carried forward.
When trading and quoting activities are inactive, for example, during extended trading hours or with an illiquid stock, bid prices can be extremely low, and ask prices can be extremely high. An exchange can also send a bad price, for example, a stock has a bid of $12.05 then an exchange sends a bid of $212.05.
To make this Minute Bar dataset usable for illiquid stocks and ETFs/ETNs, algoseek filters out extreme quotes by the following two criteria:
Bid price < (0.05 x average price of last 10 days)
Ask price > (10 x average price of last 10 days)
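The two criteria above can be sketched in a few lines of Python. This is an illustration only, not algoseek's implementation: the function name `is_extreme_quote` and the parameter `avg_price_10d` are hypothetical, and the sketch assumes a quote is dropped when either criterion holds.

```python
def is_extreme_quote(bid, ask, avg_price_10d):
    """Return True if the quote meets either extreme-quote criterion.

    avg_price_10d is a hypothetical name for the average price of the
    last 10 days; how that average is computed is not specified here.
    """
    bid_too_low = bid < 0.05 * avg_price_10d
    ask_too_high = ask > 10 * avg_price_10d
    # Assumption: either criterion alone is enough to drop the quote.
    return bid_too_low or ask_too_high


print(is_extreme_quote(12.05, 12.07, 12.00))   # ordinary quote
print(is_extreme_quote(0.01, 12.07, 12.00))    # bid far below the average
print(is_extreme_quote(12.05, 500.00, 12.00))  # ask far above the average
```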
The bars have trade volume separated into fields:
ExchangeVolume: Trades done on the listed exchanges
FinraVolume: Trades done in Dark Pools, internally by Broker-Dealers, or on an Over-the-Counter (OTC) market reporting to FINRA
The volume data is separated to make it easy to understand the trading in a bar period for either the public-listed exchanges or private non-public trading.
algoseek provides Equity market data in plain text CSV files. The first row of each CSV file is a fixed header, followed by rows of data corresponding to individual bars. By default, data is organized into one file per symbol per trading day. For example, all trade and quote bars for ticker AAPL on Mar 3, 2020, are stored in one CSV file.
Due to their large data sizes, CSV files are gzip-compressed (with a csv.gz extension) with a compression ratio of about 8:1.
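As a minimal sketch of working with this format, the snippet below reads a gzip-compressed CSV file with the Python standard library. The file name `AAPL.csv.gz` and the helper `read_bars` are hypothetical. The sketch writes a tiny sample file first so that it runs on its own, and the sample contains only a few of the columns.

```python
import csv
import gzip
import os
import tempfile


def read_bars(path):
    """Read one gzip-compressed CSV file into a list of dicts keyed by header name."""
    with gzip.open(path, mode="rt", newline="") as handle:
        return list(csv.DictReader(handle))


# Build a tiny sample file so the sketch is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "AAPL.csv.gz")  # hypothetical file name
with gzip.open(sample_path, mode="wt", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["Ticker", "TimeBarStart", "ExchangeVolume", "FinraVolume"])
    writer.writerow(["AAPL", "09:30", "782172", "166665"])
    writer.writerow(["AAPL", "09:31", "154750", "98094"])

bars = read_bars(sample_path)
print(len(bars), bars[0]["TimeBarStart"], bars[0]["FinraVolume"])
```

All values are returned as strings, so a blank field arrives as an empty string and can be told apart from a literal 0.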
Table 1 demonstrates the full list of data fields in the Equity Buy/Sell Pressure and Retail Indicators file with sample contents for the AAPL symbol.
Table 1: Equity Buy/Sell Pressure and Retail Indicators Sample Data
Date | 20241204 | 20241204 | 20241204 |
Ticker | AAPL | AAPL | AAPL |
TimeBarStart | 09:30 | 09:31 | 09:32 |
OpenBarTime | 09:30:00.000000000 | 09:31:00.000000000 | 09:32:00.000000000 |
MinSpread | 0.01 | 0.01 | 0.01 |
MaxSpread | 0.25 | 0.1 | 0.09 |
ExchangeVolume | 782172 | 154750 | 85395 |
FinraVolume | 166665 | 98094 | 55095 |
TotalVolume | 948837 | 252844 | 140490 |
TotalTrades | 11932 | 3542 | 2207 |
TotalQuoteCount | 63761 | 37371 | 28399 |
ExchangeTradeCount | 4923 | 2258 | 1143 |
FinraTradeCount | 7009 | 1284 | 1064 |
OddLotTradeCount | 9959 | 2612 | 1616 |
OddLotTotalShares | 97184 | 39549 | 28124 |
RelativeSpreadAverage | 0.00019 | 0.00015 | 0.00015 |
TradeCumulDistributionToBid | 86012:90593:95163:118459:156132:224128:262617:275348:278183:413208 | 75988:83203:84476:92509:109844:139312:152189:158191:158462:240628 | 51715:52502:54761:60900:67483:97662:106461:108136:109726:134166 |
RetailTRFBuySize | 104 | 42 | 8 |
RetailTRFSellSize | 144 | 403 | 472 |
RetailOddLotBuySize | 38581 | 9969 | 6757 |
RetailOddLotSellSize | 24427 | 15101 | 9081 |
TRFRetailPress | 0.00026 | 0.00176 | 0.00342 |
OddLotPress | 0.06641 | 0.09915 | 0.11273 |
TRFRetailOddLotPress | 0.06667 | 0.10091 | 0.11615 |
OddLotTRFRetailRatio | 0.99608 | 0.98256 | 0.97058 |
TRFRetailBuySellRatio | 0.72222 | 0.10422 | 0.01695 |
OddLotBuySellRatio | 1.57944 | 0.66015 | 0.74408 |
TRFRetailOddLotBuySellRatio | 1.57442 | 0.6457 | 0.70815 |
RelNetTRFRetailFlow | -0.16129 | -0.81124 | -0.96667 |
RelNetOddLotFlow | 0.22464 | -0.20471 | -0.14674 |
RelNetTRFRetailOddLotFlow | 0.22313 | -0.21529 | -0.17085 |
TRFRetImbalance | 0.41935 | 0.09438 | 0.01667 |
OddLotImbalance | 0.61232 | 0.39765 | 0.42663 |
TRFRetOddLotImbalance | 0.61156 | 0.39236 | 0.41457 |
TRFRetSentiment | 0.10484 | 0.00305 | -0.09392 |
OddLotSentiment | 0.07704 | 0.05974 | 0.06461 |
TRFRetOddLotSentiment | 0.07373 | 0.05625 | 0.06072 |
Table 2 (below) provides the name, base event, default value, brief description, and data type for each data field (column) in the Equity Buy/Sell Pressure and Retail Indicators Minute Bar CSV file.
Table column “Missing” indicates a default behavior in case the data field value is not present or cannot be calculated. The column value “Never” means that the data field value is always present.
The table column “Base Event” indicates what type of events are included for data field calculation. Quote: bid/ask event, Trade-X: trades on the exchange, Trade-F: trades on FINRA/TRF, Trade: trades on both exchange and FINRA/TRF.
Table 2: Equity Buy/Sell Pressure and Retail Indicators CSV File Fields Schema
Field | Base Event | Type (Format) | Missing | Description |
TradeDate | - | string (yyyymmdd) | Never | Trading date in yyyymmdd format |
Ticker | - | string | Never | Symbol name |
TimeBarStart | - | string (time) | Never | Start time of the bar. For a minute bar, the format is HH:MM. For a second bar, the format is HH:MM:SS |
OpenBarTime | Quote | string (timestamp) | Never | Open time of the bar, for example, one minute bar: 11:03:00.000000000 |
MinSpread | Quote | decimal | Never | Minimum Bid-Ask spread size. This may be 0 if the market was crossed during the bar. A negative spread caused by a back quote is set to zero |
MaxSpread | Quote | decimal | Never | Maximum NBBO Bid-Ask spread in a bar |
ExchangeVolume | Trade-X | integer | 0 | Share volume on public exchanges only, that is, the number of shares traded excluding FINRA/TRF reported trades. See field “FinraVolume” for FINRA trades. |
FinraVolume | Trade-F | integer | 0 | Number of shares traded reported by FINRA/TRF. Trades reported by FINRA are from broker-dealer internalization, dark pools, over-the-counter, etc. FINRA trades represent volume that is hidden or not publicly available to trade |
TotalVolume | Trade | integer | Blank | Total number of shares traded during the bar period from both public exchanges and off-exchange FINRA/TRF trades |
TotalTrades | Trade | integer | 0 | Total number of trades |
TotalQuoteCount | Quote | integer | Blank | Total count of top-of-book Bid and Ask quotes from public exchanges for bar period |
ExchangeTradeCount | Trade-X | integer | Blank | Total number of trades on public exchanges for bar period |
FinraTradeCount | Trade-F | integer | Blank | Total number of FINRA/TRF trades for bar period |
OddLotTradeCount | Trade-X | integer | Blank | Total number of Odd Lot trades during the bar period from public exchanges only |
OddLotTotalShares | Trade-X | integer | Blank | Total number of Odd Lot shares traded during the bar period from public exchanges only |
RelativeSpreadAverage | Quote Trade | decimal | Blank | The Relative Spread is the Bid/Ask spread relative to the midpoint price at time t for a trade. It shows how wide the spread is compared to the price. For each minute, the average of the Relative Spreads for each trade is calculated. See below “RelativeSpreadAverage” |
TradeCumulDistributionToBid | Quote Trade | string | Blank | Cumulative distribution volume of Trade price relative to the Bid during the bar period with 0 being trade at Bid and 1 being trade at Ask. Cumulative distribution created with percentage probabilities of 0:0.05:0.1:0.20:0.40:0.60:0.80:0.90:0.95:1. See below “TradeCumulDistributionToBid” |
RetailTRFBuySize | Trade-F Quote | integer | Blank | Estimated number of shares that are Buy retail TRF order flow. That is trades outside of public exchanges reported as TRF. Retail trades are identified using trades executed sub-penny within a specific range. See the “RetailTRFBuySize” notes below with reference to the paper. |
RetailTRFSellSize | Trade-F Quote | integer | Blank | Estimated number of shares that are Sell retail TRF order flow. That is trades outside of public exchanges reported as TRF. Retail trades are identified using trades executed sub-penny within a specific range. See the “RetailTRFSellSize” notes below with reference to the paper. |
RetailOddLotBuySize | Trade-X Quote | integer | Blank | Estimated number of shares that are Buy retail order flow. Retail trades are identified using odd lot trades executed at a price on a specific side of the midpoint on public exchanges. See “RetailOddLotBuySize” below. |
RetailOddLotSellSize | Trade-X Quote | integer | Blank | Estimated number of shares that are Sell retail order flow. Retail trades are identified using odd lot trades executed at a price on a specific side of the midpoint on public exchanges. See “RetailOddLotSellSize” below. |
TRFRetailPress | Trade-F | decimal | Blank | The metric measures the proportion of TRF retail trades’ size to the total market volume |
OddLotPress | Trade-X Quote | decimal | Blank | The metric measures the proportion of odd lot (on public exchanges) trades’ size to the total market volume |
TRFRetailOddLotPress | Trade Quote | decimal | Blank | The metric measures the proportion of TRF retail plus odd lot (on public exchanges) trades’ size to the total market volume |
OddLotTRFRetailRatio | Trade Quote | decimal | Blank | This metric measures the odd lot (on public exchanges) volume compared to the total TRF retail plus odd lot (on public exchanges) volume |
TRFRetailBuySellRatio | Trade-F | decimal | Blank | This metric measures the TRF retail buy size to the TRF retail sell size |
OddLotBuySellRatio | Trade-X Quote | decimal | Blank | This metric measures the odd lot (on public exchanges) buy size to the odd lot (on public exchanges) sell size |
TRFRetailOddLotBuySellRatio | Trade Quote | decimal | Blank | This metric measures the TRF retail plus odd lot (on public exchanges) buy size to the TRF retail plus odd lot (on public exchanges) sell size |
RelNetTRFRetailFlow | Trade-F | decimal | Blank | The metric measures the relative net difference between TRF retail buy and sell sizes to total TRF retail volume |
RelNetOddLotFlow | Trade-X Quote | decimal | Blank | The metric measures the relative net difference between odd lot (on public exchanges) buy and sell sizes to the total odd lot (on public exchanges) volume |
RelNetTRFRetailOddLotFlow | Trade Quote | decimal | Blank | The metric measures the relative net difference between TRF retail plus odd lot (on public exchanges) buy and sell sizes to the total TRF retail plus odd lot (on public exchanges) volume |
TRFRetImbalance | Trade-F | decimal | Blank | This metric highlights the imbalance between TRF retail buy and sell sizes |
OddLotImbalance | Trade-X Quote | decimal | Blank | This metric highlights the imbalance between odd lot (on public exchanges) buy and sell sizes |
TRFRetOddLotImbalance | Trade Quote | decimal | Blank | This metric highlights the imbalance between TRF retail plus odd lot (on public exchanges) buy and sell sizes |
TRFRetSentiment | Trade-F | decimal | Blank | A sentiment indicator that measures TRF retail pressure cycles based on buy/sell imbalances. For example, 1-week cycles. |
OddLotSentiment | Trade-X Quote | decimal | Blank | A sentiment indicator that measures odd lot (on public exchanges) pressure cycles based on buy/sell imbalances. For example, 1-week cycles. |
TRFRetOddLotSentiment | Trade Quote | decimal | Blank | A sentiment indicator that measures TRF retail plus odd lot (on public exchanges) pressure cycles based on buy/sell imbalances. For example, 1-week cycles. |
The Buy/Sell Pressure and Retail Indicators Minute Bar dataset covers the entire trading day from the start of pre-market trading to the end of after-hours trading (ET time):
Pre-Market Hours: 04:00:00 to 09:30:00 (excluding)
Market Hours: 09:30:00 to 16:00:00 (excluding)
Post-Market Hours: 16:00:00 to 20:00:00
Note: Occasionally, minute bars are extended several minutes past 20:00.
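The session boundaries above can be expressed as a small lookup. This is a sketch under stated assumptions: the function `session_for` and its labels are hypothetical, bars starting at or after the close (including the occasional bars past 20:00) are all labelled post-market, and on early-close days the caller passes 13:00 as the close.

```python
from datetime import time

PRE_MARKET_OPEN = time(4, 0)
MARKET_OPEN = time(9, 30)
REGULAR_CLOSE = time(16, 0)
EARLY_CLOSE = time(13, 0)


def session_for(time_bar_start, market_close=REGULAR_CLOSE):
    """Map a TimeBarStart string (HH:MM or HH:MM:SS) to a session label."""
    hours, minutes = time_bar_start.split(":")[:2]
    t = time(int(hours), int(minutes))
    if t < PRE_MARKET_OPEN:
        return "outside"
    if t < MARKET_OPEN:
        return "pre-market"
    if t < market_close:
        return "market"
    # Assumption: everything from the close onward is post-market,
    # including bars that extend a few minutes past 20:00.
    return "post-market"


print(session_for("09:29"), session_for("09:30"), session_for("16:00"))
print(session_for("13:00", market_close=EARLY_CLOSE))
```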
The stock market is closed for trading on most US holidays. For reference, algoseek publishes a list of historical holidays which is available at s3://us-equity-market-holidays/holidays.csv (direct download link: https://us-equity-market-holidays.s3.amazonaws.com/holidays.csv).
Markets sometimes close early at 13:00:00 on the day before holidays such as Independence Day and Thanksgiving. You can download algoseek’s early close date and time list from AWS S3 storage at s3://us-equity-market-holidays/earlycloses.csv (or use a direct link: us-equity-market-holidays.s3.amazonaws.com/earlycloses.csv).
Time Bar Start Format: One-second bar 13:03:01 is from time greater than 13:03:01 to less than 13:03:02. One-minute bar 11:04 is from time greater than 11:04 to less than 11:05.
Empty Fields: an empty field has no value and is “Blank.” For example, “ExchangeTradeCount” is blank when there are no trades on public exchanges during the bar period, whereas the field “TotalTrades”, which measures the total number of trades in a bar, will be “0” if there are no trades. See the “Missing” column above for the behavior of each field.
Spread Validation Rules: For fields that include spreads (for example, VolumeWeightSpread), the spread needs to be excluded when it is clearly incorrect.
Pre-Market: Exclude when a bid or ask is further away than 30% of the midpoint. For example, if stock is 100 then
bid >= (0.7 * 100) = 70 and Ask <= (1.3 * 100) = 130
Regular Market: Exclude when a bid or ask is further away than 10% of the midpoint. For example, if stock is 100 then
bid >= (0.9 * 100) = 90 and Ask <= (1.1 * 100) = 110
Use dynamic calculation to move from Pre-Market to Regular Market. When 3 or more Bid/Ask spreads are within 10% of the midpoint after the start of the Regular market (9:30:00), continue to use 10% of the midpoint. After 20 NBBO updates (consider one NBBO update to be two rows with an update for Bid and Ask), move to 10%.
Post Market: Exclude when a bid or ask is further away than 30% of the midpoint. Start immediately after market Close (4 pm or 1 pm for half-days)
Always exclude spread for
The field “SpreadValidTime” has the total milliseconds for each bar period showing the total number of milliseconds that meet valid criteria based on the requirements listed here.
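The fixed 30% and 10% thresholds can be sketched as a single check against the midpoint. The function name `spread_is_valid` and the parameter `max_fraction` are hypothetical, the midpoint is assumed to be the reference price in the "if stock is 100" examples, and the dynamic transition after the open is not modelled here.

```python
PRE_POST_MARKET_FRACTION = 0.30  # pre-market and post-market threshold
REGULAR_MARKET_FRACTION = 0.10   # regular market threshold


def spread_is_valid(bid, ask, max_fraction):
    """Return True when both bid and ask lie within max_fraction of the midpoint."""
    mid = (bid + ask) / 2
    if mid <= 0:
        return False  # assumption: a non-positive midpoint is never valid
    return bid >= (1 - max_fraction) * mid and ask <= (1 + max_fraction) * mid


# Midpoint is 100 in both cases; bid 85 and ask 115 are 15% away from it.
print(spread_is_valid(85.0, 115.0, PRE_POST_MARKET_FRACTION))  # within 30%
print(spread_is_valid(85.0, 115.0, REGULAR_MARKET_FRACTION))   # outside 10%
```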
RelativeSpreadAverage: The relative spread size is the Bid/Ask spread relative to the midpoint price at time t for a trade. It shows how wide the spread is compared to the midpoint price. For each minute, the average of the relative spread sizes for each trade is calculated.
For each trade in the bar:
t = time of a trade
b = National Best Bid (NBBO) at t (use last NBBO before time t)
a = National Best Ask (NBBO) at t (use last NBBO before time t)
m = midpoint price at t, m = (b + a) / 2
rs = relative spread at t is max(a - b, 0) / m (max function is used as NBBO spread may be 0 or inverted at times)
RelativeSpreadAverage = sum(rs) / count(rs)
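The steps above translate directly into code. In this sketch the function name `relative_spread_average` is hypothetical, and each trade is assumed to arrive already paired with the last NBBO bid and ask before its timestamp. Returning `None` for a bar without trades mirrors the "Blank" default.

```python
def relative_spread_average(trade_quotes):
    """Average relative spread over the trades in one bar.

    trade_quotes: iterable of (bid, ask) pairs, one per trade, holding the
    last NBBO before the trade time.
    """
    rs = []
    for b, a in trade_quotes:
        m = (b + a) / 2
        # max() guards against a locked or inverted NBBO.
        rs.append(max(a - b, 0) / m)
    return sum(rs) / len(rs) if rs else None


# Two trades: a spread of 2 around a midpoint of 100, then a locked market.
print(relative_spread_average([(99.0, 101.0), (100.0, 100.0)]))
```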
TradeCumulDistributionToBid: A distribution of trades relative to the bid and offer. Calculate the distance of each trade to the Bid with 0 executed at the Bid and 1 executed at the Ask. Then calculate the cumulative distribution of the distance from the Bid using the below probabilities.
0, 0.05, 0.1, 0.20, 0.40, 0.60, 0.80, 0.90, 0.95, 1
This shows whether there was pressure on either side of the bid/offer. If the definition of a ‘retail trade’ later moves away from the one in the paper, the distribution offers a potential way to recalibrate without recalculating the entire history. As an example, say 1000 shares are traded in the given bin: 100 at the bid, 400 at the midpoint, and 500 at the offer. The distribution will then look as follows:
0 100
0.05 100
0.1 100
0.2 100
0.4 100
0.6 500
0.8 500
0.9 500
0.95 500
1 1000
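The worked example can be reproduced with a short sketch. The function name and the tuple layout are hypothetical. Two details are assumptions not stated in this guide: trades priced outside the bid/ask are clamped to 0 or 1, and trades seen while the NBBO is locked or crossed are skipped.

```python
THRESHOLDS = [0, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95, 1]


def trade_cumul_distribution_to_bid(trades):
    """Cumulative volume by trade position between bid (0) and ask (1).

    trades: iterable of (price, size, bid, ask) tuples.
    """
    positions = []
    for price, size, bid, ask in trades:
        if ask <= bid:
            continue  # assumption: skip locked or crossed quotes
        pos = (price - bid) / (ask - bid)
        pos = min(max(pos, 0.0), 1.0)  # assumption: clamp outside trades
        positions.append((pos, size))
    return [sum(size for pos, size in positions if pos <= t) for t in THRESHOLDS]


# 100 shares at the bid, 400 at the midpoint, 500 at the offer.
example = [
    (10.00, 100, 10.00, 10.02),
    (10.01, 400, 10.00, 10.02),
    (10.02, 500, 10.00, 10.02),
]
dist = trade_cumul_distribution_to_bid(example)
print(":".join(str(v) for v in dist))  # 100:100:100:100:100:500:500:500:500:1000
```

Joining the values with colons gives the same layout as the TradeCumulDistributionToBid strings in Table 1.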
RetailTRFBuySize, RetailTRFSellSize: Identify Retail Buy and Sell trades executed internally by a Broker-Dealer or wholesaled to a dark pool, based on their sub-penny pricing as a TRF-reported trade. These indicators are based on the research paper “Tracking Retail Investor Activity” by E. Boehmer, Ch. M. Jones, X. Zhang, and X. Zhang, published in the Journal of Finance.
The field calculations are based on the following description from the paper:
Based on these institutional arrangements, identifying transactions initiated by retail customers is fairly straightforward. Transactions with a retail seller tend to be reported on a TRF at prices that are just above a round penny due to the small price improvement, while transactions with a retail buyer tend to be reported on a TRF at prices just below a round penny. To be precise, for all trades reported to a FINRA TRF (exchange code “D” in TAQ), let P_it be the transaction price in stock i at time t, and let Z_it ≡ 100 * mod(P_it, 0.01) be the fraction of a penny associated with that transaction price. Z_it can take any value in the unit interval [0,1). If Z_it is in the interval (0,0.4), we identify it as a retail sell transaction. If Z_it is in the interval (0.6,1), then the transaction is coded as a retail buy transaction. To be conservative, transactions at a round penny (Z_it = 0) or near the half-penny (0.4 ≤ Z_it ≤ 0.6) are not assigned to the retail category.
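The rule quoted from the paper can be sketched as follows. The names `classify_trf_trade` and `retail_trf_sizes` are hypothetical, and the sketch assumes every trade passed in is already a FINRA/TRF report. Decimal arithmetic is used so that the sub-penny fraction is not distorted by binary floating point.

```python
from decimal import Decimal


def classify_trf_trade(price):
    """Return "buy", "sell", or None from the sub-penny fraction of a TRF price."""
    p = Decimal(str(price))
    z = (p % Decimal("0.01")) * 100  # fraction of a penny, in [0, 1)
    if Decimal("0") < z < Decimal("0.4"):
        return "sell"
    if Decimal("0.6") < z < Decimal("1"):
        return "buy"
    return None  # round penny or near the half-penny: not classified


def retail_trf_sizes(trades):
    """Sum share sizes per side; trades is an iterable of (price, size)."""
    totals = {"buy": 0, "sell": 0}
    for price, size in trades:
        side = classify_trf_trade(price)
        if side is not None:
            totals[side] += size
    return totals


print(classify_trf_trade("10.0012"), classify_trf_trade("10.0088"))
print(retail_trf_sizes([("10.0012", 50), ("10.0088", 30), ("10.0050", 200)]))
```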
RetailOddLotBuySize, RetailOddLotSellSize: Identify Retail Buy and Sell trades from odd lots transactions executed on public exchanges. These indicators are based on a comparison of trade price with the NBBO midpoint. If the trade price is higher than the midpoint the transaction is identified as a buy transaction. If it is lower than the midpoint the transaction is identified as a sell transaction.
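The midpoint comparison can be sketched as below. The function name `classify_odd_lot` is hypothetical. The sketch assumes the trade is from a public exchange, that an odd lot is any size under 100 shares, and that a trade exactly at the midpoint is left unclassified, since the rule above only covers prices above or below it.

```python
def classify_odd_lot(price, size, bid, ask, round_lot=100):
    """Return "buy", "sell", or None for an exchange trade using the NBBO midpoint."""
    if size >= round_lot:
        return None  # not an odd lot
    mid = (bid + ask) / 2
    if price > mid:
        return "buy"
    if price < mid:
        return "sell"
    return None  # assumption: trades at the midpoint are not classified


print(classify_odd_lot(10.02, 50, 10.00, 10.02))
print(classify_odd_lot(10.00, 50, 10.00, 10.02))
print(classify_odd_lot(10.02, 200, 10.00, 10.02))
```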
Def: This metric measures the proportion of TRF retail trades’ size to the total market volume.
RP = (RetailTRFBuySize + RetailTRFSellSize) / TotalVolume
Def: This metric measures the odd lot (on public exchanges) volume compared to the total TRF retail plus odd lot (on public exchanges) volume.
OLRR = (RetailOddLotBuySize + RetailOddLotSellSize) / (RetailTRFBuySize + RetailTRFSellSize + RetailOddLotBuySize + RetailOddLotSellSize)
Def: This metric measures the relative net difference between TRF retail plus odd lot (on public exchanges) buy and sell sizes to total TRF retail plus odd lot (on public exchanges) volume. A positive RNRODF suggests buying pressure from retail investors, while a negative RNRODF indicates selling pressure.
RNRODF = ((RetailTRFBuySize + RetailOddLotBuySize) - (RetailTRFSellSize + RetailOddLotSellSize)) / (RetailTRFBuySize + RetailTRFSellSize + RetailOddLotBuySize + RetailOddLotSellSize)
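Def: This metric highlights the imbalance between TRF retail buy and sell sizes.
RI = RetailTRFBuySize / (RetailTRFBuySize + RetailTRFSellSize)
Def: This metric highlights the imbalance between odd lot (on public exchanges) buy and sell sizes.
OLI = RetailOddLotBuySize / (RetailOddLotBuySize + RetailOddLotSellSize)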
Def: A sentiment indicator that measures TRF retail pressure 1-hour cycles based on buy/sell imbalances.
RS = MovingAverage(Relative Net TRF Retail Flow)
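The formulas above can be checked against the 09:30 AAPL bar in Table 1. In this sketch the function names are hypothetical and the dictionary keys are the matching Table 2 field names. Two parts are assumptions: returning `None` for a zero denominator is an illustrative stand-in for "Blank", and the moving-average window is a free parameter because this guide does not state its length.

```python
def retail_metrics(trf_buy, trf_sell, odd_buy, odd_sell, total_volume):
    """Compute RP, OLRR, RNRODF, RI and OLI from the four retail size fields."""
    trf = trf_buy + trf_sell
    odd = odd_buy + odd_sell
    both = trf + odd

    def ratio(num, den):
        return num / den if den else None  # assumption: None stands for Blank

    return {
        "TRFRetailPress": ratio(trf, total_volume),                      # RP
        "OddLotTRFRetailRatio": ratio(odd, both),                        # OLRR
        "RelNetTRFRetailOddLotFlow": ratio(
            (trf_buy + odd_buy) - (trf_sell + odd_sell), both),          # RNRODF
        "TRFRetImbalance": ratio(trf_buy, trf),                          # RI
        "OddLotImbalance": ratio(odd_buy, odd),                          # OLI
    }


def trailing_moving_average(values, window):
    """Simple trailing mean; the window length is a hypothetical choice."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Inputs taken from the 09:30 column of Table 1.
metrics = retail_metrics(104, 144, 38581, 24427, 948837)
for name, value in metrics.items():
    print(name, round(value, 5))
```

Rounded to five decimals, the results agree with the 09:30 column of Table 1: 0.00026, 0.99608, 0.22313, 0.41935, and 0.61232.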
Older versions of Excel will automatically convert the TimeBarStart field into an Excel format timestamp, but this fails when TimeBarStart is HHMMSSmmm (millisecond) or HHMMSSmmmiiinnn (nanosecond). For timestamp with the nanosecond (millisecond) format, import the data using the Excel “From Text” option and set the data type for column “TimeBarStart” to “Text”, so Excel does not automatically try to convert it.
This section describes logic for minute bar calculations based on events from the Trade and Quote dataset. Please also refer to the Equity Trade and Quote Guide for more details on data fields and condition flags used.
There is a separate logic for the Standard Bars dataset and Bars with FINRA/TRF and Odd Lots Excluded.
You should also exclude any event with one or more flags listed in Table 2.
Table 2: Flags for Trade and Quote Events to be Excluded During Bar Calculations
Trade Events | Quote Events | ||
Bit Mask Position | Flags | Bit Mask Position | Flags |
14 | tOutOfSequence | 3 | qClosing |
20 | tAveragePrice | 4 | qNewsDissemination |
22 | tPriceVariation | 5 | qNewsPending |
23 | tRule155 | 6 | qTradingRangeIndication |
24 | tOfficialClose | 7 | qOrderImbalance |
25 | tPriorReferencePrice | 13 | qResume |
26 | tOfficialOpen | ||
You should only include events with one or more flags listed in Table 3. If the event has any of the exclude flags enabled, it is not included. If the event does not contain any flags from the include list, it is not included in bar calculations.
Table 3: Flags for Trade and Quote Events to be Included During Bar Calculations
Trade Events | Quote Events | ||
Bit Mask Position | Flags | Bit Mask Position | Flags |
0 | tRegular | 0 | qRegular |
1 | tCash | 1 | qSlow |
2 | tNextDay | 2 | qGap |
5 | tIntermarketSweep | 11 | qOpeningQuote |
6 | tOpeningPrints | 21 | qFastTrading |
7 | tClosingPrints | ||
10 | tFormT | ||
13 | tExtendedHours | ||
21 | tCross | ||
29 | tTradeThroughExempt | ||
31 | tOddLot | ||
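The include and exclude rules can be sketched as a bit mask test. This assumes each event carries its condition flags as an integer in which bit position n corresponds to the value `1 << n`. The helper names are hypothetical, while the bit positions are copied from the two flag tables above.

```python
EXCLUDE_TRADE_BITS = (14, 20, 22, 23, 24, 25, 26)
INCLUDE_TRADE_BITS = (0, 1, 2, 5, 6, 7, 10, 13, 21, 29, 31)
EXCLUDE_QUOTE_BITS = (3, 4, 5, 6, 7, 13)
INCLUDE_QUOTE_BITS = (0, 1, 2, 11, 21)


def mask(bits):
    """Combine bit positions into a single integer mask."""
    result = 0
    for bit in bits:
        result |= 1 << bit
    return result


def keep_event(flags, include_bits, exclude_bits):
    """Keep an event only if no exclude flag is set and at least one include flag is set."""
    if flags & mask(exclude_bits):
        return False
    return bool(flags & mask(include_bits))


regular_trade = 1 << 0                     # tRegular
late_trade = (1 << 0) | (1 << 14)          # tRegular + tOutOfSequence
print(keep_event(regular_trade, INCLUDE_TRADE_BITS, EXCLUDE_TRADE_BITS))
print(keep_event(late_trade, INCLUDE_TRADE_BITS, EXCLUDE_TRADE_BITS))
```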
Additionally, you should filter out test quote events using the following approach:
MinPrice = 0.05 * AveragePrice
MaxPrice = 10 * AveragePrice
MinPrice = 0.03
MaxPrice = 19998
If a Quote price is lower than MinPrice or higher than MaxPrice, the event is excluded.
Note: We do not recommend applying price filtering for Trade events.