Performance when converting large numbers of documents
Posted: Thu Feb 14, 2008 5:38 pm
I'm using the automation API to convert a large number of documents. A PHP script dumps them into a "queue" directory, and every couple of minutes I fire off a PHP script that first checks whether another convert job is still running. If so, it exits; otherwise, it gets a list of the files in the directory and starts converting them.
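A minimal sketch of that kind of single-instance queue runner, written in Python rather than the PHP actually used. The names `run_queue`, `lock_path`, and the `convert` callback are hypothetical, and the real script may detect a running job in some other way; here a lock file created with `O_EXCL` stands in for that check.

```python
import os
from pathlib import Path

def run_queue(queue_dir, lock_path, convert):
    """Convert every file in queue_dir unless another run holds the lock.

    Returns the list of processed file names, or None if a job was
    already running. `convert` is a hypothetical callback taking a Path.
    """
    try:
        # O_EXCL makes creation fail if the lock file already exists,
        # so only one runner at a time can get past this point.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None  # another convert job is still running
    try:
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        done = []
        for path in sorted(Path(queue_dir).iterdir()):
            if path.is_file():
                convert(path)
                done.append(path.name)
        return done
    finally:
        # Release the lock even if a conversion raised an exception.
        os.remove(lock_path)
```

The lock file should live outside the queue directory so it is not picked up as a document. A runner that crashes hard can leave a stale lock behind, so a real setup would probably also want to check the age of the lock or the PID stored in it.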
It seems to work fairly well. The problem is that many documents can't be converted for one reason or another (too big, requires user input to complete the printing process, etc.). I've found a timeout that seems to balance the time needed to print long documents against the time wasted waiting for documents that just won't convert.
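A per-document timeout like this can be sketched as follows, again in Python. It assumes each conversion is started as a separate command, which is an assumption for illustration only: the setup described here drives the automation API directly, and `convert_with_timeout` and the `cmd` argument are hypothetical names.

```python
import subprocess

def convert_with_timeout(cmd, timeout_s):
    """Run one conversion command; return 'ok', 'failed', or 'timeout'.

    `cmd` is a hypothetical argument list that launches the conversion
    of a single document.
    """
    try:
        # subprocess.run kills the child process once the timeout expires.
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"
```

Only the launched process itself is killed on timeout. Any application started behind the scenes to do the printing may keep running and need separate cleanup.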
So, my question is: how do I scale this process up? Can I run multiple queues and web server (Apache) processes to make better use of my dual CPU cores (the CPU is idle much of the time)? Or will the underlying Windows printing subsystem be the choke point and make each job wait? Will there be a problem if two queues try to print the same type of document at once (for example, Print2Flash opening two instances of Word behind the scenes to convert both docs)?
Thanks for any insight/ideas.