DocumentViewer slow response due to a large number of files in the cache folder
Problem reported by Alessandro Rossetto - 1/14/2019 at 2:29 AM
Resolved
I'm experiencing slow response times when using DocumentViewer.

I found that if I manually delete the DocumentCache folder, application performance returns to normal.
I have ~20,000 cached documents with CacheMaxDays set to 7.

I have reduced it to 1 now, but I cannot test it in production yet because I need to wait for the next maintenance window to install the update. Even so, based on my statistics, I will still have ~4,000 cached documents per day.

Disk I/O performance is not bad at all and doesn't get worse as the number of files increases, so this makes me think the problem is in the cache implementation.
Do you enumerate all the files in the DocumentCache folder every time a document preview is requested?

Thanks
Alessandro Rossetto

4 Replies

Cem Alacayir Replied
Employee Post
Hi,
Yes, it's probably caused by the DocumentCache implementation, because it needs to list the folders multiple times to find the cached file. DocumentCache uses a subfolder structure with unique IDs in the subfolder names, but for reference purposes each name also includes the original input file name, such as:

 ~DOC Sample File.doc~zt2ple 

This is so that you can find and delete a specific cache file visually/manually in Windows File Explorer when you need to, e.g. when you want to force re-conversion and re-caching of a specific input file. Because the subfolder name is not exact (the original input file name part is not constant or known in advance the way the unique ID is, since the same input file under different file names generates the same unique ID), DocumentCache needs to enumerate all subfolders and then find the one whose name ends with the unique ID, ignoring the original input file name part. This is normally fast on a physical file system (not so on Azure or S3), but I guess it may slow down with 20,000 subfolders even on a physical disk.

I think we should give up using non-exact subfolder names: the subfolder name should be only the unique ID so that the subfolder can be found quickly, as it was in early versions of DocumentUltimate. Then, for reference purposes, we can place a dummy file named after the original input file inside the subfolder, so that the user can search for that file and find the related cache subfolder.
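
To make the difference concrete, here is a minimal Python sketch of the two lookup strategies described above. It is only an illustration, not DocumentUltimate's actual code: the function names and the ID `ab3xyz` are hypothetical, and a local directory stands in for the cache location.

```python
import os
import tempfile

def find_by_suffix_scan(cache_root, unique_id):
    """Non-exact names: list every subfolder and match the trailing '~<id>'.
    Cost grows with the number of entries in the cache folder."""
    for name in os.listdir(cache_root):
        full = os.path.join(cache_root, name)
        if name.endswith("~" + unique_id) and os.path.isdir(full):
            return full
    return None

def find_by_exact_name(cache_root, unique_id):
    """Exact names: the subfolder is named by the ID alone, so no listing is needed."""
    path = os.path.join(cache_root, unique_id)
    return path if os.path.isdir(path) else None

def create_entry(cache_root, unique_id, original_name):
    """Create an ID-named subfolder holding an empty marker file that carries
    the original input file name, so it can still be found by searching."""
    path = os.path.join(cache_root, unique_id)
    os.makedirs(path, exist_ok=True)
    open(os.path.join(path, original_name), "w").close()
    return path

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Non-exact layout, as in the example name above.
        os.mkdir(os.path.join(root, "~DOC Sample File.doc~zt2ple"))
        print(find_by_suffix_scan(root, "zt2ple"))
        # Exact layout with a marker file (hypothetical ID).
        create_entry(root, "ab3xyz", "Sample File.doc")
        print(find_by_exact_name(root, "ab3xyz"))
```

The first function has to touch every directory entry per lookup, which is where a crowded folder (or a cloud listing request) hurts; the second performs a single existence check regardless of how many items are cached.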

We will optimize this and let you know soon.


Alessandro Rossetto Replied
Thank you very much, Cem.
I look forward to downloading the updated version.

Regards
Alessandro Rossetto
Cem Alacayir Replied
Employee Post Marked As Resolution
Hi Alessandro,
We have just released Version 4.5.5 - January 22, 2019, which improves the stability and performance of DocumentCache:
 
  • Optimized cache folder structure so that access is very fast even when it's crowded (e.g. 20,000 files). This will also vastly improve access times when an Amazon S3 or Azure Blob location is used for the cache (no more unnecessary cache folder listing, so fewer requests to cloud storage). The existing cache folder will be migrated to the new structure automatically when this version first runs.

  • Cache folder can now be shared by multiple processes reliably, as it uses distributed locking. Even processes on different machines are handled by creating lock files within the cache folder. For example, if you use a network share as the cache folder, different instances of the application will reliably share the cache (no unexpected "cache file not found" errors, and an ongoing caching operation is completed only once).

  • Automatic cache trimming (cleanup of expired items) is now a background task that runs at regular intervals specified via the AutoTrimInterval property (default is 20 minutes). In the older version, trimming was only triggered when a new cache item was created. So auto cache trim is now more reliable and efficient.

  • Replaced the maxDays constructor parameter with the MaxAge property, which is a TimeSpan, so expiration can now also be set in hours, minutes, or seconds, not only in days.
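
As a rough illustration of the locking and trimming ideas in the list above, the following Python sketch combines a lock file created atomically inside the cache folder with a periodic trim of items older than a maximum age. It is a sketch under assumptions, not the library's implementation: all names (`LOCK_NAME`, `try_acquire_lock`, `trim_once`, `start_auto_trim`) are hypothetical, cached items are modeled as plain files, stale locks are not handled, and how atomic an exclusive create is depends on the underlying file system.

```python
import os
import threading
import time
from datetime import timedelta

LOCK_NAME = ".trim.lock"  # hypothetical lock file name

def try_acquire_lock(lock_path):
    """Atomically create the lock file; True only if this caller created it."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another process (or machine) already holds the lock
    os.close(fd)
    return True

def release_lock(lock_path):
    """Release the lock by deleting the lock file."""
    os.remove(lock_path)

def trim_once(cache_root, max_age):
    """Delete files older than max_age (a timedelta).
    Returns the removed names, or None if another holder has the lock."""
    lock_path = os.path.join(cache_root, LOCK_NAME)
    if not try_acquire_lock(lock_path):
        return None
    try:
        cutoff = time.time() - max_age.total_seconds()
        removed = []
        for entry in list(os.scandir(cache_root)):
            if entry.name == LOCK_NAME or not entry.is_file():
                continue
            if entry.stat().st_mtime < cutoff:
                os.remove(entry.path)
                removed.append(entry.name)
        return removed
    finally:
        release_lock(lock_path)

def start_auto_trim(cache_root, max_age, interval, stop_event):
    """Run trim_once every `interval` (a timedelta) on a daemon thread
    until stop_event is set."""
    def loop():
        # Event.wait returns False on timeout, True once the event is set.
        while not stop_event.wait(interval.total_seconds()):
            trim_once(cache_root, max_age)
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

For example, `start_auto_trim(path, timedelta(hours=6), timedelta(minutes=20), threading.Event())` would mirror a six-hour maximum age with the 20-minute default interval mentioned above; expressing the age as a duration rather than a day count is what allows sub-day expiration.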
Alessandro Rossetto Replied
Great!
Alessandro Rossetto
