|
Excessive memory usage in PHP5 FCGI |
Posted by Jan on May-06-2011 18:09 |
|
When using ChartDirector 5000002 with PHP 5.3.3, I found that it uses an excessive amount of memory.
This is a problem when using PHP in web server mode (e.g. with Fast-CGI) when the individual PHP processes are not terminated at the end of a request but will be reused for subsequent requests.
For example, when using the lighttpd web server with PHP in Fast-CGI mode, each PHP child process is reused for x requests (x is configurable) before it is terminated. This has the advantage that the web server does not need to create a new process for each request but can reuse existing ones, thus saving time.
It also means the PHP child processes might be around for a long time and hold on to memory.
When using ChartDirector to create many XYCharts in a single PHP child process, a lot of memory might be used inside the process. The problem is that this memory is not freed completely at the end of the request, and the PHP child process will still have a lot of memory allocated to it.
If there are many PHP child processes around, each with a lot of memory allocated, this might eat up the complete memory of the web server.
To reproduce the issue, simply run this code in a script on the web server. Note: do not use CLI PHP or "normal" CGI PHP, because the issue will not be reproducible there. Also note that this might not run on Windows because it uses the ps command:
<?php
print "<pre>";
// Report the memory of the current PHP child process via ps.
$pid=getmypid();
$cmd="ps u -p ".$pid;
print "pid is ".$pid."\n";
print "command is ".$cmd."\n";
print "\nbefore:\n";
print shell_exec($cmd)."\n";
require "phpchartdir.php";
// Create many charts without keeping a reference to any of them.
$count=100000;
for ($j=0;$j<$count;$j++) {
new XYChart(100,100);
}
print "\nafter:\n";
print shell_exec($cmd)."\n";
print "\nnormal exit\n";
print "</pre>";
This will print the current process id (pid), execute ps for the process, create 100,000 charts and execute ps again.
Example output might be:
pid is 27910
command is ps u -p 27910
before:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
nobody 27910 0.0 0.2 69552 7152 ? S 11:26 0:00 php-5.3.3/bin/php-cgi
after:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
nobody 27910 1.1 27.3 773520 708972 ? S 11:26 0:12 php-5.3.3/bin/php-cgi
As you can see, the resident set size (RSS) is 7,152 KB at the first invocation of ps (before creation of charts) and 708,972 KB at the second invocation of ps (after creation of charts).
After the PHP script terminates and PHP has performed its internal garbage collection, manually calling ps for the process on the command line returned these values:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
nobody 27910 1.3 4.3 175680 112584 ? S 11:26 0:17 php-5.3.3/bin/php-cgi
This means that the process still had 112,584 KB of memory allocated, which is too much from my point of view.
Overall, memory usage for the process went from 0.2% to 27.3% during script execution, but it did not return to 0.2% after script termination. There is still too much memory allocated, and this will cause problems if there are many such requests. About 20 such child processes might consume the server's total memory.
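The "about 20 processes" estimate can be checked with a rough calculation. This is only a sketch based on the ps figures quoted above: the variable names are made up for illustration, %MEM is a rounded value, and the calculation ignores memory shared between processes and memory used by anything else on the server.

```php
<?php
// Figures taken from the ps output quoted above (after the script ended).
$memPercentPerIdleChild = 4.3;   // %MEM column of the idle child process
$rssKb = 112584;                 // RSS column of the same process, in KB

// Rough size of the server's RAM implied by those two values.
$estimatedTotalKb = $rssKb / ($memPercentPerIdleChild / 100);

// How many such idle children would add up to 100% of memory.
$processesToFillMemory = (int) floor(100 / $memPercentPerIdleChild);

print "estimated total RAM: " . round($estimatedTotalKb) . " KB\n";
print "idle children needed to fill it: " . $processesToFillMemory . "\n";
```

Under these assumptions the result is 23 processes, which is in line with the estimate of about 20.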
I think this should be investigated.
I tried to patch the PHP Chartdirector API myself a bit (file attached), and this resulted in some better memory usage values:
pid is 27912
command is ps u -p 27912
before:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
nobody 27912 0.0 0.2 69552 7152 ? S 11:26 0:00 php-5.3.3/bin/php-cgi
after:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
nobody 27912 0.4 0.4 71624 10668 ? S 11:26 0:07 php-5.3.3/bin/php-cgi
After manual execution of ps after script end:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
nobody 27912 0.4 0.4 71108 10504 ? S 11:26 0:07 php-5.3.3/bin/php-cgi
I modified the API to not rely on the autodestroy and shutdown functions, but used PHP5 destructors, which will be invoked automatically when a PHP5 object gets destroyed. These destructors call the existing __del__() methods.
This seems to work; however, I have no idea whether the patch breaks anything. In particular, I think it might not be PHP4 compatible, but PHP4 is hopelessly outdated (stable PHP5 was released in 2004) and should not be supported any further from my point of view.
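The destructor approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the attached patch: the class NativeResourceDemo and its counter are hypothetical stand-ins for a ChartDirector wrapper class. Only the idea of a PHP5 destructor that delegates to an existing __del__() method is taken from the post.

```php
<?php
// Hypothetical stand-in for a wrapper class that owns a native resource.
// It is NOT part of the ChartDirector API.
class NativeResourceDemo
{
    // Number of instances whose resource has not been released yet.
    public static $live = 0;
    private $released = false;

    public function __construct()
    {
        self::$live++;
    }

    // Explicit cleanup method; safe to call more than once.
    public function __del__()
    {
        if (!$this->released) {
            $this->released = true;
            self::$live--;
        }
    }

    // PHP5 destructor: runs when the last reference to the object goes away.
    public function __destruct()
    {
        $this->__del__();
    }
}

for ($j = 0; $j < 1000; $j++) {
    new NativeResourceDemo(); // no reference is kept, so it is destroyed right away
}

$liveAfterLoop = NativeResourceDemo::$live;
print "live objects after loop: " . $liveAfterLoop . "\n";
```

Because nothing else holds a reference to the objects, each one is released as soon as its statement finishes, so the counter is back at zero after the loop.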
Can you please have a look whether the Chartdirector API can be improved in order to use less memory when used in PHP5?
If you should have any questions, please let me know.
Thank you and best regards
Jan
|
Re: Excessive memory usage in PHP5 FCGI |
Posted by Peter Kwan on May-06-2011 23:49 |
|
Hi Jan,
In fact, in the vast majority of cases, "ChartDirector for PHP" is used in environments in which the PHP interpreter is reused for multiple requests. The most common configuration is SAPI, in which the process never dies (unless the server is shut down), and in this environment, the PHP interpreter may be reused for millions of requests. A lot of people are also using FastCGI nowadays.
As far as I know, whether or not the PHP interpreter is reused, at the end of each script it will call the shutdown functions. So if the PHP interpreter is reused to execute 100 scripts, at the end of each script, it will call the shutdown functions. In other words, the shutdown functions will be called 100 times in total, once for each script, even though the PHP interpreter process does not really terminate. See the PHP documentation:
http://php.net/manual/en/function.register-shutdown-function.php
So as long as your code does not create 100000 charts in the same script, it should not use an excessive amount of memory. If you create only 1 chart in a script, and run the script 100000 times (reusing the same PHP process), the PHP process should have called garbageCollector 100000 times, and at any instant, there will be at most 1 chart in memory.
On the other hand, if you create 100000 charts in one script, it will need to have 100000 charts in memory, and then the memory will be released by PHP when garbageCollector runs. I am not sure how PHP works internally. Maybe PHP intentionally does not release all the memory at the OS level (so the OS level call "ps" will not see the memory freed), but keeps it in a "free memory pool" for "recycling" within the PHP process. This may explain why a portion of the memory is still not freed at the OS level.
For your information, PHP uses its own internal memory allocator. It allocates large blocks of memory from the OS, then sub-allocates that memory to PHP variables in the scripts. When the script releases the memory, PHP may not return all of it to the OS. It may keep a portion of it to be reused by future script instances.
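The difference between memory used by script variables and memory held by the allocator can be observed from inside a script. The following is a small sketch, assuming only the standard memory_get_usage() function. The snapshot() helper and the test data are made up for illustration, and the exact numbers will differ between PHP versions and configurations.

```php
<?php
// Print and return both views of the process memory.
function snapshot($label)
{
    $used = memory_get_usage(false);     // bytes currently in use by the script
    $reserved = memory_get_usage(true);  // bytes the allocator has taken from the system
    print $label . ": used=" . $used . " reserved=" . $reserved . "\n";
    return array('used' => $used, 'reserved' => $reserved);
}

$before = snapshot("before");

// Allocate a noticeable amount of memory (roughly 10 MB of string data).
$data = array();
for ($i = 0; $i < 100000; $i++) {
    $data[] = str_repeat("x", 100);
}
$during = snapshot("during");

// Release it again; the "used" figure drops, while "reserved" may stay higher.
unset($data);
$after = snapshot("after");
```

Comparing the "reserved" values before and after gives an impression of how much memory the allocator keeps for later reuse instead of returning it to the OS.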
For web usage, it is unlikely that you need to create 100000 charts in one PHP script instance. However, if you do need to create 100000 charts in one script instance for some reason, you may not need to have them all in memory at once. So you may call "garbageCollector" to release the charts periodically.
You mentioned that using destructors to call __del__ seems to free memory, but using garbageCollector does not. Would you mind clarifying how you call garbageCollector? I think for your code to be comparable to the destructor method, you should call it inside the loop, like:
for ($j=0;$j<$count;$j++) {
new XYChart(100,100);
garbageCollector();
}
If you call garbageCollector at the end of the script, it should have no effect, as the garbageCollector will automatically be called anyway after the script ends.
ChartDirector does not use destructors as destructors are not supported in PHP 4. For backwards compatibility purposes, we still need to support PHP 4. Also, as explained above, we think for realistic scripts, using garbageCollector will not consume any more memory than using destructors.
Would you mind testing the following scenarios to see if a large amount of memory is really used?
(a) Create a script that generates 1 or a few charts. Then run the script a large number of times (e.g. 100000 times), and see if the memory usage increases without limit.
(b) Try to put the garbageCollector call in the for loop, and see if a large amount of memory is still used.
Regards
Peter Kwan |
Re: Excessive memory usage in PHP5 FCGI |
Posted by Jan on May-09-2011 16:00 |
|
Hi Peter,
thanks for your prompt response to this.
I know PHP uses its own memory allocator. It will keep some of the allocated memory blocks available for subsequent requests. Even if the memory is not actively used by the PHP child process when the request is terminated, it is still allocated and unavailable from an OS point of view. This is not a memory leak, because the memory will be reused later if the child process serves other requests and needs to allocate memory again.
PHP has its own garbage collection for variables (and for objects, thus charts). Memory might be freed at any time the PHP engine thinks it's safe and sensible.
Using a shutdown function to free memory has the disadvantage that it will happen at the very end of the request only (whereas the internal gc might run before that). That means if a lot of memory was allocated in the request, it will stay allocated for a potentially long time (the duration of the request), and worse, during the request, PHP cannot free any resources allocated for chart objects even if they are no longer actively used. This is because the chart objects are still referenced by a global variable used in the garbageCollector() function. The global reference will also prevent the PHP engine from cleaning up the chart objects itself, so one has to wait until script end for the shutdown function. Until then, each new chart created will need extra memory because PHP cannot reuse memory that previously was allocated for another chart in the same request.
Implementing PHP 5 object destructors fixed this issue because PHP was then able to reuse the memory temporarily allocated to chart objects that are not used anymore in the request. Using destructors, the engine was able to free memory in between so the total amount of memory used was way lower than when creating a lot of charts and freeing all memory at the end.
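The effect of the global reference can be reproduced without ChartDirector. The sketch below uses hypothetical stand-ins (the TrackedDemo class and the $registry array are not ChartDirector names) to compare how many objects are alive at once when every object is also stored in a global list, and when no such list exists.

```php
<?php
// Hypothetical stand-in for a chart class; it only counts live instances.
class TrackedDemo
{
    public static $live = 0;
    public function __construct() { self::$live++; }
    public function __destruct()  { self::$live--; }
}

// Plays the role of the global list used for cleanup at the end of the request.
$registry = array();

// Variant 1: every object is also stored in the global registry.
for ($j = 0; $j < 1000; $j++) {
    $registry[] = new TrackedDemo();
}
$liveWithRegistry = TrackedDemo::$live;   // all objects are still referenced

// Emptying the registry is what an end-of-request cleanup would do.
$registry = array();
$liveAfterCleanup = TrackedDemo::$live;

// Variant 2: no registry, so each object is destroyed once it is unreferenced.
$peakWithoutRegistry = 0;
for ($j = 0; $j < 1000; $j++) {
    $chart = new TrackedDemo();
    $peakWithoutRegistry = max($peakWithoutRegistry, TrackedDemo::$live);
    unset($chart);
}

print "live with registry: " . $liveWithRegistry . "\n";
print "live after cleanup: " . $liveAfterCleanup . "\n";
print "peak without registry: " . $peakWithoutRegistry . "\n";
```

With the registry, all 1000 objects stay alive until it is emptied. Without it, at most one object exists at any time, which matches the lower peak memory observed with destructors.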
Calling garbageCollector() manually as you suggested also works; however, from my point of view it is only a workaround compared to object destructors. This is because with the call to garbageCollector(),
- a change to the application is required at all places where charts are generated
- the global gc might free charts that are still actively used somewhere (if not used in a simple for loop as in the example), thus potentially resulting in fatal errors or segfaults if these objects are accessed later in the application
- the application finally needs to track itself which objects might be used and which are not
Object destructors and PHP's built-in garbage collector would exactly solve these issues without any modifications necessary to the application.
Which means to me that using regular destructors is the better way from both the memory and the convenience point of view.
Using the shutdown function and the final garbage collector will also work from the functional point of view. However, generating more than one chart in a script might use more memory than necessary from the time of chart creation until script end. The more charts one creates in a script, the worse it gets.
There does not seem to be a memleak though, just more memory is allocated than necessary which cumulatively might sum up to too much memory.
(btw as you mentioned "realistic scripts": we found the issue while conducting performance tests with a reporting application that was massively accessed in parallel during the tests - with high parallelism we ran into the issue easily even if each script does create only a few charts).
I understand you still need to support PHP4. But if you asked me, I'd say it would make most sense to create a separate API for PHP5 which could use all the advantages that are available with PHP5 (not only destructors but also namespaces, class constants etc., which means that there is no need to put all ChartDirector constants and variables into the global namespace).
Is there a chance that a PHP5 API is released at some point? |
Re: Excessive memory usage in PHP5 FCGI |
Posted by Peter Kwan on May-10-2011 03:04 |
|
Hi Jan,
We have been thinking about using both destructors and garbageCollector for a while, since the code is still valid in PHP 4 even with destructors. (The destructors simply are not called in PHP 4, and in this case, garbageCollector can clean it up.) We just do not yet have the time for comprehensive testing.
Also, we think in most cases, the garbageCollector method is sufficient. For example, if each instance of a script is just creating a few charts, there is no need to call garbageCollector, as it will be called automatically anyway. In fact, we have never seen any web script that needs to call garbageCollector, because it is unlikely that a web script would be used to create so many charts. (People calling garbageCollector are usually writing PHP batch scripts that run from the command line and create thousands of charts in a background process.)
Regards
Peter Kwan |
|