What I discovered is that a consultant on the application side found that in his test case the query plan for the actual parameter value was more efficient so he implemented forceliterals in Dynamics. What the unintended consequence was to eliminate plan reuse. We went from 1 plan for the query to 30,000! This was why Query Store was growing so rapidly.
What I found was that we had 51,212 different query plans. Of that total 80% of the query plans were from ONLY 4 statements!
--Find the total count of stored query plans
SELECT count(*)
FROM sys.query_store_query_text t
JOIN sys.query_store_query q ON t.query_text_id = q.query_text_id
--Find the count of query plans by stemmed query text (queries not reusing plans)
select left(txt.query_sql_text,50), count(distinct Pl.query_id) QueryCount, count(distinct Pl.plan_id) PlanCount
FROM sys.query_store_plan AS Pl
JOIN sys.query_store_query AS Qry ON Pl.query_id = Qry.query_id
JOIN sys.query_store_query_text AS Txt ON Qry.query_text_id = Txt.query_text_id
group by left(txt.query_sql_text,50)
having count(distinct Pl.query_id) > 1000
order by count(distinct Pl.plan_id) desc;
Once I had the problem identified then I decided to help purge the Query Store.
First I needed a way to watch the Query Store to find if it filled and flipped to Read Only and when that happened.
--verify query store is collecting data
SELECT actual_state_desc, desired_state_desc, current_storage_size_mb,
max_storage_size_mb, readonly_reason, interval_length_minutes,
stale_query_threshold_days, size_based_cleanup_mode_desc,
query_capture_mode_desc
FROM sys.database_query_store_options;
--when did query store flip to readonly --time is in UTC
select top 1 last_execution_time
from sys.query_store_query
order by last_execution_time desc
Next, since the 4 queries accounted for 80% of my plans I decided to purge these regularly. I am using a SQL Job purging these queries every 10 minutes as the cursor for the delete is a bit slow.
--remove specific queries that are not reusing query plans
DECLARE @query_sql_text TABLE (sql_text varchar(100), QueryCount INT, PlanCount INT);
declare @query_sql_text_state varchar(100);
--count of query plans by stemmed query text
select left(txt.query_sql_text,50), count(distinct Pl.query_id) QueryCount, count(distinct Pl.plan_id) PlanCount
FROM sys.query_store_plan AS Pl
JOIN sys.query_store_query AS Qry ON Pl.query_id = Qry.query_id
JOIN sys.query_store_query_text AS Txt ON Qry.query_text_id = Txt.query_text_id
group by left(txt.query_sql_text,50)
having count(distinct Pl.query_id) > 1000
order by count(distinct Pl.plan_id) desc;
--loop through all query plans in @query_sql_text
While (Select Count(*) From @query_sql_text) > 0
Begin
Select Top 1 @query_sql_text_state = sql_text From @query_sql_text order by sql_text asc
print @query_sql_text_state
--remove the query_ids from query store
DECLARE @id int
DECLARE adhoc_queries_cursor CURSOR
FOR
select top 100 Pl.query_id
FROM sys.query_store_plan AS Pl
JOIN sys.query_store_query AS Qry ON Pl.query_id = Qry.query_id
JOIN sys.query_store_query_text AS Txt ON Qry.query_text_id = Txt.query_text_id
where left(trim(txt.query_sql_text),50) = @query_sql_text_state
group by Pl.query_id
order by min(Qry.last_execution_time) asc
OPEN adhoc_queries_cursor;
FETCH NEXT FROM adhoc_queries_cursor INTO @id;
WHILE @@fetch_status = 0
BEGIN
PRINT @id
EXEC sp_query_store_remove_query @id
FETCH NEXT FROM adhoc_queries_cursor INTO @id
END
CLOSE adhoc_queries_cursor;
DEALLOCATE adhoc_queries_cursor;
delete Top (1) From @query_sql_text where sql_text = @query_sql_text_state
End
Now that Query Store size is better managed and I can get back to tuning the poor performing queries.
No comments:
Post a Comment