Thursday, November 21, 2019

Rapid Query Store Growth

I recently had a problem that our Query Store size growth increased dramatically. Our initial size was 512MB and this was sufficient for 2 months. Then one day Query Store flipped to Read Only mode because it ran out of space. A quick manual expansion and flipping back to Read Write fixed the issue. I was then in the same situation within a couple days. Another quick fix had it running again and then the research into the rapid change to the growth rate started.

What I discovered is that a consultant on the application side found that in his test case the query plan for the actual parameter value was more efficient so he implemented forceliterals in Dynamics. What the unintended consequence was to eliminate plan reuse. We went from 1 plan for the query to 30,000! This was why Query Store was growing so rapidly.

What I found was that we had 51,212 different query plans. Of that total 80% of the query plans were from ONLY 4 statements!


--Find the total count of stored query plans
SELECT count(*)
FROM sys.query_store_query_text t 
JOIN sys.query_store_query q ON t.query_text_id = q.query_text_id 


 --Find the count of query plans by stemmed query text (queries not reusing plans)
select  left(txt.query_sql_text,50), count(distinct Pl.query_id) QueryCount, count(distinct Pl.plan_id) PlanCount
FROM sys.query_store_plan AS Pl  
JOIN sys.query_store_query AS Qry ON Pl.query_id = Qry.query_id  
JOIN sys.query_store_query_text AS Txt ON Qry.query_text_id = Txt.query_text_id
group by left(txt.query_sql_text,50)
having count(distinct Pl.query_id) > 1000
order by count(distinct Pl.plan_id) desc;  


Once I had the problem identified then I decided to help purge the Query Store.

First I needed a way to watch the Query Store to find if it filled and flipped to Read Only and when that happened.


--verify query store is collecting data
SELECT actual_state_desc, desired_state_desc, current_storage_size_mb,   
    max_storage_size_mb, readonly_reason, interval_length_minutes,   
    stale_query_threshold_days, size_based_cleanup_mode_desc,   
    query_capture_mode_desc  
FROM sys.database_query_store_options;  


--when did query store flip to readonly --time is in UTC
select top 1 last_execution_time 
from sys.query_store_query 
order by last_execution_time desc


Next, since the 4 queries accounted for 80% of my plans I decided to purge these regularly. I am using a SQL Job purging these queries every 10 minutes as the cursor for the delete is a bit slow.

--remove specific queries that are not reusing query plans
DECLARE @query_sql_text TABLE (sql_text varchar(100), QueryCount INT, PlanCount INT);
declare @query_sql_text_state varchar(100);

--count of query plans by stemmed query text
select  left(txt.query_sql_text,50), count(distinct Pl.query_id) QueryCount, count(distinct Pl.plan_id) PlanCount
FROM sys.query_store_plan AS Pl  
JOIN sys.query_store_query AS Qry ON Pl.query_id = Qry.query_id  
JOIN sys.query_store_query_text AS Txt ON Qry.query_text_id = Txt.query_text_id
group by left(txt.query_sql_text,50)
having count(distinct Pl.query_id) > 1000
order by count(distinct Pl.plan_id) desc;  

--loop through all query plans in @query_sql_text
While (Select Count(*) From @query_sql_text) > 0
Begin

Select Top 1 @query_sql_text_state = sql_text From @query_sql_text order by sql_text asc
print @query_sql_text_state

 --remove the query_ids from query store
 DECLARE @id int
 DECLARE adhoc_queries_cursor CURSOR
 FOR
  select top 100  Pl.query_id
  FROM sys.query_store_plan AS Pl  
  JOIN sys.query_store_query AS Qry ON Pl.query_id = Qry.query_id  
  JOIN sys.query_store_query_text AS Txt ON Qry.query_text_id = Txt.query_text_id
  where left(trim(txt.query_sql_text),50) = @query_sql_text_state 
  group by Pl.query_id
  order by min(Qry.last_execution_time) asc

 OPEN adhoc_queries_cursor;
 FETCH NEXT FROM adhoc_queries_cursor INTO @id;
 WHILE @@fetch_status = 0
  BEGIN
   PRINT @id
   EXEC sp_query_store_remove_query @id
   FETCH NEXT FROM adhoc_queries_cursor INTO @id
  END
 CLOSE adhoc_queries_cursor;
 DEALLOCATE adhoc_queries_cursor;

delete Top (1) From @query_sql_text where sql_text = @query_sql_text_state

End

Now that Query Store size is better managed and I can get back to tuning the poor performing queries.

No comments:

Post a Comment