I have an unimportant query which uses NEWID() and joins many tables. It returns about 10k rows in about 3 seconds, so NEWID() can be acceptable in cases like that, where performance is not too bad and does not have a huge impact.

Here is the explanation from Brent Ozar's blog. From that post I have summarized the methods you can use to pick a random row; you can read the blog for more details.

4 ways to get a random row from a large table:

- Method 1, Bad: ORDER BY NEWID() > bad performance!
- Method 2, Better but Strange: TABLESAMPLE > many gotchas, and it is not really random!
- Method 3, Best but Requires Code: Random Primary Key > fastest, but won't work with negative numbers.
- Method 4, OFFSET-FETCH (2012+) > only performs properly with a clustered index.

More on Method 3: get the top ID in the table, generate a random number no larger than it, and look up the first row at or above that ID. For the top N rows, run the code below N times, or generate N random numbers and use them in an IN clause.

/* Get a random number smaller than the table's top ID
   (variable names here are illustrative) */
DECLARE @maxId INT = (SELECT MAX(Id) FROM dbo.Users);
DECLARE @randomId INT = ABS(CHECKSUM(NEWID())) % @maxId;

/* Get the first row around that ID */
SELECT TOP (1) *
FROM dbo.Users AS u
WHERE u.Id >= @randomId;

As for why Method 1 is bad: this is an old question, but one aspect of the discussion is missing, in my opinion - PERFORMANCE. When someone gets fancy, they add that you should really wrap NEWID() in CHECKSUM(), you know, for performance! The problem with this method is that you are still guaranteed a full index scan and then a complete sort of the data. If you've worked with any serious data volume, this can rapidly become expensive. Look at a typical execution plan for such a query and note how the sort takes 96% of the time.
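To make the comparison concrete, here is a minimal sketch of Method 1 against the same dbo.Users table used above. This is my own illustration of the pattern, not code from the quoted blog; the CHECKSUM variant mentioned earlier is shown as a comment.

/* Method 1: ORDER BY NEWID() - simple to write, but every execution
   scans the whole table and sorts it on a random value */
SELECT TOP (1) *
FROM dbo.Users
ORDER BY NEWID();

/* The "fancy" variant sorts on CHECKSUM(NEWID()) instead,
   but the scan-plus-sort cost is essentially the same:
   ORDER BY CHECKSUM(NEWID()) */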
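Method 2 would look roughly like the sketch below (again my own example, assuming the same dbo.Users table). TABLESAMPLE picks whole data pages rather than individual rows, which is where the "not really random" gotcha comes from, and on small tables it can return no rows at all.

/* Method 2: TABLESAMPLE - samples pages, not rows, so the result is
   biased toward rows that share a page and may even be empty */
SELECT TOP (1) *
FROM dbo.Users TABLESAMPLE SYSTEM (1 PERCENT);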
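And a sketch of Method 4, again assuming dbo.Users with a clustered index on Id: generate a random row offset and let OFFSET-FETCH skip to it. Ordering by the clustered key is what lets this perform properly, per the note in the list above.

/* Method 4: OFFSET-FETCH (SQL Server 2012+) - skip a random number
   of rows, then fetch one; order by the clustered index key
   to avoid an expensive sort */
DECLARE @rowCount BIGINT = (SELECT COUNT(*) FROM dbo.Users);
DECLARE @skip BIGINT = ABS(CHECKSUM(NEWID())) % @rowCount;

SELECT *
FROM dbo.Users
ORDER BY Id
OFFSET @skip ROWS FETCH NEXT 1 ROWS ONLY;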