Quantcast

Global Statement Compilation in HSQLDB

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Global Statement Compilation in HSQLDB

Daniel Walsh
Hi Devs,

I have a question I couldn't find an answer for online or by investigating the HSQLDB code myself. 

Is there a reason currently in the implementation that the Stateless aspect of a Prepared Statement when compiled isn't stored in a global non-connection level cache ? 

I have a use case where connections are created and destroyed regularly and each connection calls 100+ prepared statement creation and executions during its lifecycle, in a test case that encompasses many other factors I increased performances by creating a naive low level cache of the compiled statement to be used each time a new PreparedStatement with a matching sql string was created.

This had a massive benefit to performance because the statements being compiled are quite complex.

Is this a feature that hasn't been implemented on purpose ? Or just an open gap ?

Thanks,
Daniel Walsh
Software Dev 

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Global Statement Compilation in HSQLDB

Fred Toussi-2
There was a global Prepared Statement cache in version 1.8. This was too complex as it had to keep track of closed sessions and statements closing on some sessions and not on others. The cache was moved to individual sessions in version 2.0.
 
You can use a connection pool. There are two options. The org.hsqldb.jdbc.JDBCPool or a generic JDBC pool.
 
The org.hsqldb.Session class has a resetSession() method that calls statementManager.reset(). The JDBCPool calls the resetSession() method. Therefore you need to modify either class to not reset the StatementManager. Third party pools may or may not be resetting the HSQLDB session upon reuse, so you need to check.
 
Fred Toussi
 
 
 
On Thu, May 26, 2016, at 19:41, Daniel Walsh wrote:
Hi Devs,
 
I have a question I couldn't find an answer for online or by investigating the HSQLDB code myself. 
 
Is there a reason currently in the implementation that the Stateless aspect of a Prepared Statement when compiled isn't stored in a global non-connection level cache ? 
 
I have a use case where connections are created and destroyed regularly and each connection calls 100+ prepared statement creation and executions during its lifecycle, in a test case that encompasses many other factors I increased performances by creating a naive low level cache of the compiled statement to be used each time a new PreparedStatement with a matching sql string was created.
 
This had a massive benefit to performance because the statements being compiled are quite complex.
 
Is this a feature that hasn't been implemented on purpose ? Or just an open gap ?
 
Thanks,
Daniel Walsh
Software Dev 
------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Global Statement Compilation in HSQLDB

Daniel Walsh
I'll investigate using a Connection Pool , i'm currently using the apache BasicDataSource for connection retrieval.

However if every time I retrieve from the pool if resetSession is called doesn't that clear the statement manager as you indicate, wiping out any performance gains of persisting compiled PreparedStatements between separate usages of the connection? Essentially I have many concurrent users hitting the same statement structures with different values for ? , ? , ? parameters. The statements are programatically generated, and quite finite, between 100-5000 thousand possible statements depending on the scale of the configuration. 

For this reason at the level of the class ParserDQL I added a very basic hack ( since this class is called in a stateful way , where fields are set on it - so i was a bit careful ) , I added a private static final ConcurrentMap of SQL String -> StatementQuery , on the basis that if the sql strings are identical their compiled queries must also be identical, I couldnt think of a scenario where they would need to be destroyed and recreated , if the DDL changes then the statement would also need to change by necessity, therefore the value in the cache would never get used again .... so except for the glaring memory leak at this moment it would work in principle. 

So when compileCursorSpecification is called on this class , from digging through the code I knew the sqlString was defined at this time, so i do a simple map lookup and return if found, or compile , push to map , return if not.
I identified the StatementQuery as the only costly part, so I cache that state-less element and cannot really see any drawbacks like you mentioned regarding closing sessions etc. 

Thanks for discussing this matter in great depth btw , I'm a large fan of HSQLDB and have used it in projects @ my company for several years.

Daniel 



On 26 May 2016 at 20:48, Fred Toussi <[hidden email]> wrote:
There was a global Prepared Statement cache in version 1.8. This was too complex as it had to keep track of closed sessions and statements closing on some sessions and not on others. The cache was moved to individual sessions in version 2.0.
 
You can use a connection pool. There are two options. The org.hsqldb.jdbc.JDBCPool or a generic JDBC pool.
 
The org.hsqldb.Session class has a resetSession() method that calls statementManager.reset(). The JDBCPool calls the resetSession() method. Therefore you need to modify either class to not reset the StatementManager. Third party pools may or may not be resetting the HSQLDB session upon reuse, so you need to check.
 
Fred Toussi
 
 
 
On Thu, May 26, 2016, at 19:41, Daniel Walsh wrote:
Hi Devs,
 
I have a question I couldn't find an answer for online or by investigating the HSQLDB code myself. 
 
Is there a reason currently in the implementation that the Stateless aspect of a Prepared Statement when compiled isn't stored in a global non-connection level cache ? 
 
I have a use case where connections are created and destroyed regularly and each connection calls 100+ prepared statement creation and executions during its lifecycle, in a test case that encompasses many other factors I increased performances by creating a naive low level cache of the compiled statement to be used each time a new PreparedStatement with a matching sql string was created.
 
This had a massive benefit to performance because the statements being compiled are quite complex.
 
Is this a feature that hasn't been implemented on purpose ? Or just an open gap ?
 
Thanks,
Daniel Walsh
Software Dev 
------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers



------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Global Statement Compilation in HSQLDB

Fred Toussi-2
If you limit statement caching to PreparedStatement the best place for the cache would be the StatementManager.
 
The StatementManager compile() method checks if the SQL and its Statement object already exists. The Statement is removed when it is closed. All Statements are cleared when the connection and the session are closed. So for your usage, you can put a non-static ConcurrentMap in org.hsqldb.Database and each time a statement is compiled, you add to the ConcurrentMap and each time the lookup in compile() does not find the SQL and its statement, you try retrieving from the Concurrent map and let the compile() method perform its other checks. Note there is a different map for each schema (same SQL compiled with a different current schema can target different tables). Therefore you need  a map of schema name to ConcurrentMap to be precise. This method doesn't need changes to resetSession and works with or without a connection pool.
 
Fred
 
On Fri, May 27, 2016, at 09:49, Daniel Walsh wrote:
I'll investigate using a Connection Pool , i'm currently using the apache BasicDataSource for connection retrieval.
 
However if every time I retrieve from the pool if resetSession is called doesn't that clear the statement manager as you indicate, wiping out any performance gains of persisting compiled PreparedStatements between separate usages of the connection? Essentially I have many concurrent users hitting the same statement structures with different values for ? , ? , ? parameters. The statements are programatically generated, and quite finite, between 100-5000 thousand possible statements depending on the scale of the configuration. 
 
For this reason at the level of the class ParserDQL I added a very basic hack ( since this class is called in a stateful way , where fields are set on it - so i was a bit careful ) , I added a private static final ConcurrentMap of SQL String -> StatementQuery , on the basis that if the sql strings are identical their compiled queries must also be identical, I couldnt think of a scenario where they would need to be destroyed and recreated , if the DDL changes then the statement would also need to change by necessity, therefore the value in the cache would never get used again .... so except for the glaring memory leak at this moment it would work in principle. 
 
So when compileCursorSpecification is called on this class , from digging through the code I knew the sqlString was defined at this time, so i do a simple map lookup and return if found, or compile , push to map , return if not.
I identified the StatementQuery as the only costly part, so I cache that state-less element and cannot really see any drawbacks like you mentioned regarding closing sessions etc. 
 
Thanks for discussing this matter in great depth btw , I'm a large fan of HSQLDB and have used it in projects @ my company for several years.
 
Daniel 
 
 
 
On 26 May 2016 at 20:48, Fred Toussi <[hidden email]> wrote:
There was a global Prepared Statement cache in version 1.8. This was too complex as it had to keep track of closed sessions and statements closing on some sessions and not on others. The cache was moved to individual sessions in version 2.0.
 
You can use a connection pool. There are two options. The org.hsqldb.jdbc.JDBCPool or a generic JDBC pool.
 
The org.hsqldb.Session class has a resetSession() method that calls statementManager.reset(). The JDBCPool calls the resetSession() method. Therefore you need to modify either class to not reset the StatementManager. Third party pools may or may not be resetting the HSQLDB session upon reuse, so you need to check.
 
Fred Toussi
 
 
 
On Thu, May 26, 2016, at 19:41, Daniel Walsh wrote:
Hi Devs,
 
I have a question I couldn't find an answer for online or by investigating the HSQLDB code myself. 
 
Is there a reason currently in the implementation that the Stateless aspect of a Prepared Statement when compiled isn't stored in a global non-connection level cache ? 
 
I have a use case where connections are created and destroyed regularly and each connection calls 100+ prepared statement creation and executions during its lifecycle, in a test case that encompasses many other factors I increased performances by creating a naive low level cache of the compiled statement to be used each time a new PreparedStatement with a matching sql string was created.
 
This had a massive benefit to performance because the statements being compiled are quite complex.
 
Is this a feature that hasn't been implemented on purpose ? Or just an open gap ?
 
Thanks,
Daniel Walsh
Software Dev 
------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
 
------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Global Statement Compilation in HSQLDB

Daniel Walsh
I initially added the Statement Cache to the Statement Manager class, but I realized that sharing a Statement object concurrently between many connections would mean mixing state between concurrent users causing functional errors. To do that correctly I would need to implement a global cache of statements that never get "used" , but rather on retrieval get cloned , so their state isn't shared, and then that Statement follows a regular lifecycle, being returned to the connection, pooled as normally done, closed and destroyed as usual. But the object in the cache would be retained, un-modified after creation , used as a template so-to-speak. 

And that whole solution as described above seemed to be less performant than just caching the non-modifiable aspect of the Statement objects and let them be constructed natively as is currently done. 

Am I correct in thinking that sharing statement objects across connections would cause data inconsistencies? It appears that way. 

Daniel 


On 27 May 2016 at 11:07, Fred Toussi <[hidden email]> wrote:
If you limit statement caching to PreparedStatement the best place for the cache would be the StatementManager.
 
The StatementManager compile() method checks if the SQL and its Statement object already exists. The Statement is removed when it is closed. All Statements are cleared when the connection and the session are closed. So for your usage, you can put a non-static ConcurrentMap in org.hsqldb.Database and each time a statement is compiled, you add to the ConcurrentMap and each time the lookup in compile() does not find the SQL and its statement, you try retrieving from the Concurrent map and let the compile() method perform its other checks. Note there is a different map for each schema (same SQL compiled with a different current schema can target different tables). Therefore you need  a map of schema name to ConcurrentMap to be precise. This method doesn't need changes to resetSession and works with or without a connection pool.
 
Fred
 
On Fri, May 27, 2016, at 09:49, Daniel Walsh wrote:
I'll investigate using a Connection Pool , i'm currently using the apache BasicDataSource for connection retrieval.
 
However if every time I retrieve from the pool if resetSession is called doesn't that clear the statement manager as you indicate, wiping out any performance gains of persisting compiled PreparedStatements between separate usages of the connection? Essentially I have many concurrent users hitting the same statement structures with different values for ? , ? , ? parameters. The statements are programatically generated, and quite finite, between 100-5000 thousand possible statements depending on the scale of the configuration. 
 
For this reason at the level of the class ParserDQL I added a very basic hack ( since this class is called in a stateful way , where fields are set on it - so i was a bit careful ) , I added a private static final ConcurrentMap of SQL String -> StatementQuery , on the basis that if the sql strings are identical their compiled queries must also be identical, I couldnt think of a scenario where they would need to be destroyed and recreated , if the DDL changes then the statement would also need to change by necessity, therefore the value in the cache would never get used again .... so except for the glaring memory leak at this moment it would work in principle. 
 
So when compileCursorSpecification is called on this class , from digging through the code I knew the sqlString was defined at this time, so i do a simple map lookup and return if found, or compile , push to map , return if not.
I identified the StatementQuery as the only costly part, so I cache that state-less element and cannot really see any drawbacks like you mentioned regarding closing sessions etc. 
 
Thanks for discussing this matter in great depth btw , I'm a large fan of HSQLDB and have used it in projects @ my company for several years.
 
Daniel 
 
 
 
On 26 May 2016 at 20:48, Fred Toussi <[hidden email]> wrote:
There was a global Prepared Statement cache in version 1.8. This was too complex as it had to keep track of closed sessions and statements closing on some sessions and not on others. The cache was moved to individual sessions in version 2.0.
 
You can use a connection pool. There are two options. The org.hsqldb.jdbc.JDBCPool or a generic JDBC pool.
 
The org.hsqldb.Session class has a resetSession() method that calls statementManager.reset(). The JDBCPool calls the resetSession() method. Therefore you need to modify either class to not reset the StatementManager. Third party pools may or may not be resetting the HSQLDB session upon reuse, so you need to check.
 
Fred Toussi
 
 
 
On Thu, May 26, 2016, at 19:41, Daniel Walsh wrote:
Hi Devs,
 
I have a question I couldn't find an answer for online or by investigating the HSQLDB code myself. 
 
Is there a reason currently in the implementation that the Stateless aspect of a Prepared Statement when compiled isn't stored in a global non-connection level cache ? 
 
I have a use case where connections are created and destroyed regularly and each connection calls 100+ prepared statement creation and executions during its lifecycle, in a test case that encompasses many other factors I increased performances by creating a naive low level cache of the compiled statement to be used each time a new PreparedStatement with a matching sql string was created.
 
This had a massive benefit to performance because the statements being compiled are quite complex.
 
Is this a feature that hasn't been implemented on purpose ? Or just an open gap ?
 
Thanks,
Daniel Walsh
Software Dev 
------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
 
------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers



------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Global Statement Compilation in HSQLDB

Fred Toussi-2
In recent versions of HSQLDB, Statement objects are not generally mutable and can be shared concurrently. The only mutable aspect is the setCursorPropertiesRequest() method that is checked for in StatementManager.compile(). Statements that are out-of-date are also rejected by the same method.
 
Fred
 
On Fri, May 27, 2016, at 11:26, Daniel Walsh wrote:
I initially added the Statement Cache to the Statement Manager class, but I realized that sharing a Statement object concurrently between many connections would mean mixing state between concurrent users causing functional errors. To do that correctly I would need to implement a global cache of statements that never get "used" , but rather on retrieval get cloned , so their state isn't shared, and then that Statement follows a regular lifecycle, being returned to the connection, pooled as normally done, closed and destroyed as usual. But the object in the cache would be retained, un-modified after creation , used as a template so-to-speak. 
 
And that whole solution as described above seemed to be less performant than just caching the non-modifiable aspect of the Statement objects and let them be constructed natively as is currently done. 
 
Am I correct in thinking that sharing statement objects across connections would cause data inconsistencies? It appears that way. 
 
Daniel 
 
 
On 27 May 2016 at 11:07, Fred Toussi <[hidden email]> wrote:
If you limit statement caching to PreparedStatement the best place for the cache would be the StatementManager.
 
The StatementManager compile() method checks if the SQL and its Statement object already exists. The Statement is removed when it is closed. All Statements are cleared when the connection and the session are closed. So for your usage, you can put a non-static ConcurrentMap in org.hsqldb.Database and each time a statement is compiled, you add to the ConcurrentMap and each time the lookup in compile() does not find the SQL and its statement, you try retrieving from the Concurrent map and let the compile() method perform its other checks. Note there is a different map for each schema (same SQL compiled with a different current schema can target different tables). Therefore you need  a map of schema name to ConcurrentMap to be precise. This method doesn't need changes to resetSession and works with or without a connection pool.
 
Fred
 
On Fri, May 27, 2016, at 09:49, Daniel Walsh wrote:
I'll investigate using a Connection Pool , i'm currently using the apache BasicDataSource for connection retrieval.
 
However if every time I retrieve from the pool if resetSession is called doesn't that clear the statement manager as you indicate, wiping out any performance gains of persisting compiled PreparedStatements between separate usages of the connection? Essentially I have many concurrent users hitting the same statement structures with different values for ? , ? , ? parameters. The statements are programatically generated, and quite finite, between 100-5000 thousand possible statements depending on the scale of the configuration. 
 
For this reason at the level of the class ParserDQL I added a very basic hack ( since this class is called in a stateful way , where fields are set on it - so i was a bit careful ) , I added a private static final ConcurrentMap of SQL String -> StatementQuery , on the basis that if the sql strings are identical their compiled queries must also be identical, I couldnt think of a scenario where they would need to be destroyed and recreated , if the DDL changes then the statement would also need to change by necessity, therefore the value in the cache would never get used again .... so except for the glaring memory leak at this moment it would work in principle. 
 
So when compileCursorSpecification is called on this class , from digging through the code I knew the sqlString was defined at this time, so i do a simple map lookup and return if found, or compile , push to map , return if not.
I identified the StatementQuery as the only costly part, so I cache that state-less element and cannot really see any drawbacks like you mentioned regarding closing sessions etc. 
 
Thanks for discussing this matter in great depth btw , I'm a large fan of HSQLDB and have used it in projects @ my company for several years.
 
Daniel 
 
 
 
On 26 May 2016 at 20:48, Fred Toussi <[hidden email]> wrote:
There was a global Prepared Statement cache in version 1.8. This was too complex as it had to keep track of closed sessions and statements closing on some sessions and not on others. The cache was moved to individual sessions in version 2.0.
 
You can use a connection pool. There are two options. The org.hsqldb.jdbc.JDBCPool or a generic JDBC pool.
 
The org.hsqldb.Session class has a resetSession() method that calls statementManager.reset(). The JDBCPool calls the resetSession() method. Therefore you need to modify either class to not reset the StatementManager. Third party pools may or may not be resetting the HSQLDB session upon reuse, so you need to check.
 
Fred Toussi
 
 
 
On Thu, May 26, 2016, at 19:41, Daniel Walsh wrote:
Hi Devs,
 
I have a question I couldn't find an answer for online or by investigating the HSQLDB code myself. 
 
Is there a reason currently in the implementation that the Stateless aspect of a Prepared Statement when compiled isn't stored in a global non-connection level cache ? 
 
I have a use case where connections are created and destroyed regularly and each connection calls 100+ prepared statement creation and executions during its lifecycle, in a test case that encompasses many other factors I increased performances by creating a naive low level cache of the compiled statement to be used each time a new PreparedStatement with a matching sql string was created.
 
This had a massive benefit to performance because the statements being compiled are quite complex.
 
Is this a feature that hasn't been implemented on purpose ? Or just an open gap ?
 
Thanks,
Daniel Walsh
Software Dev 
------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
 
------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
 
------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Global Statement Compilation in HSQLDB

Daniel Walsh
I made these changes against 2.3.4 , would I be able to get SVN access to branch the project and implement a proposed addition ? 

Ideally i'd like to share my results with you and convince you of the magnitude of the performance gain, so this feature can be re-integrated into the main codeline. I'll run some test cases implementing the cache as you proposed and look for side-effects in my data, if all looks good i'll submit it to my fork for you to review ? 

Daniel 

On 27 May 2016 at 11:49, Fred Toussi <[hidden email]> wrote:
In recent versions of HSQLDB, Statement objects are not generally mutable and can be shared concurrently. The only mutable aspect is the setCursorPropertiesRequest() method that is checked for in StatementManager.compile(). Statements that are out-of-date are also rejected by the same method.
 
Fred
 
On Fri, May 27, 2016, at 11:26, Daniel Walsh wrote:
I initially added the Statement Cache to the Statement Manager class, but I realized that sharing a Statement object concurrently between many connections would mean mixing state between concurrent users causing functional errors. To do that correctly I would need to implement a global cache of statements that never get "used" , but rather on retrieval get cloned , so their state isn't shared, and then that Statement follows a regular lifecycle, being returned to the connection, pooled as normally done, closed and destroyed as usual. But the object in the cache would be retained, un-modified after creation , used as a template so-to-speak. 
 
And that whole solution as described above seemed to be less performant than just caching the non-modifiable aspect of the Statement objects and let them be constructed natively as is currently done. 
 
Am I correct in thinking that sharing statement objects across connections would cause data inconsistencies? It appears that way. 
 
Daniel 
 
 
On 27 May 2016 at 11:07, Fred Toussi <[hidden email]> wrote:
If you limit statement caching to PreparedStatement the best place for the cache would be the StatementManager.
 
The StatementManager compile() method checks if the SQL and its Statement object already exists. The Statement is removed when it is closed. All Statements are cleared when the connection and the session are closed. So for your usage, you can put a non-static ConcurrentMap in org.hsqldb.Database and each time a statement is compiled, you add to the ConcurrentMap and each time the lookup in compile() does not find the SQL and its statement, you try retrieving from the Concurrent map and let the compile() method perform its other checks. Note there is a different map for each schema (same SQL compiled with a different current schema can target different tables). Therefore you need  a map of schema name to ConcurrentMap to be precise. This method doesn't need changes to resetSession and works with or without a connection pool.
 
Fred
 
On Fri, May 27, 2016, at 09:49, Daniel Walsh wrote:
I'll investigate using a Connection Pool , i'm currently using the apache BasicDataSource for connection retrieval.
 
However if every time I retrieve from the pool if resetSession is called doesn't that clear the statement manager as you indicate, wiping out any performance gains of persisting compiled PreparedStatements between separate usages of the connection? Essentially I have many concurrent users hitting the same statement structures with different values for ? , ? , ? parameters. The statements are programatically generated, and quite finite, between 100-5000 thousand possible statements depending on the scale of the configuration. 
 
For this reason at the level of the class ParserDQL I added a very basic hack ( since this class is called in a stateful way , where fields are set on it - so i was a bit careful ) , I added a private static final ConcurrentMap of SQL String -> StatementQuery , on the basis that if the sql strings are identical their compiled queries must also be identical, I couldnt think of a scenario where they would need to be destroyed and recreated , if the DDL changes then the statement would also need to change by necessity, therefore the value in the cache would never get used again .... so except for the glaring memory leak at this moment it would work in principle. 
 
So when compileCursorSpecification is called on this class , from digging through the code I knew the sqlString was defined at this time, so i do a simple map lookup and return if found, or compile , push to map , return if not.
I identified the StatementQuery as the only costly part, so I cache that state-less element and cannot really see any drawbacks like you mentioned regarding closing sessions etc. 
 
Thanks for discussing this matter in great depth btw , I'm a large fan of HSQLDB and have used it in projects @ my company for several years.
 
Daniel 
 
 
 
On 26 May 2016 at 20:48, Fred Toussi <[hidden email]> wrote:
There was a global Prepared Statement cache in version 1.8. This was too complex as it had to keep track of closed sessions and statements closing on some sessions and not on others. The cache was moved to individual sessions in version 2.0.
 
You can use a connection pool. There are two options. The org.hsqldb.jdbc.JDBCPool or a generic JDBC pool.
 
The org.hsqldb.Session class has a resetSession() method that calls statementManager.reset(). The JDBCPool calls the resetSession() method. Therefore you need to modify either class to not reset the StatementManager. Third party pools may or may not be resetting the HSQLDB session upon reuse, so you need to check.
 
Fred Toussi
 
 
 
On Thu, May 26, 2016, at 19:41, Daniel Walsh wrote:
Hi Devs,
 
I have a question I couldn't find an answer for online or by investigating the HSQLDB code myself. 
 
Is there a reason currently in the implementation that the Stateless aspect of a Prepared Statement when compiled isn't stored in a global non-connection level cache ? 
 
I have a use case where connections are created and destroyed regularly and each connection calls 100+ prepared statement creation and executions during its lifecycle, in a test case that encompasses many other factors I increased performances by creating a naive low level cache of the compiled statement to be used each time a new PreparedStatement with a matching sql string was created.
 
This had a massive benefit to performance because the statements being compiled are quite complex.
 
Is this a feature that hasn't been implemented on purpose ? Or just an open gap ?
 
Thanks,
Daniel Walsh
Software Dev 
------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
 
------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
 
------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers



------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Global Statement Compilation in HSQLDB

Fred Toussi-2
We do accept contributions and I often fix any issues or side-effects they introduce.
 
Please implement the change as I proposed in the simplest possible way. The final implementation needs some setting to control the amount of statement caching (from default zero up to a given number) but you can leave it out for now as it touches lots of other unrelated source files. You can then send me a diff to review etc. SVN access comes after that.
 
Regards
 
Fred
 
On Fri, May 27, 2016, at 12:11, Daniel Walsh wrote:
I made these changes against 2.3.4 , would I be able to get SVN access to branch the project and implement a proposed addition ? 
 
Ideally i'd like to share my results with you and convince you of the magnitude of the performance gain, so this feature can be re-integrated into the main codeline. I'll run some test cases implementing the cache as you proposed and look for side-effects in my data, if all looks good i'll submit it to my fork for you to review ? 
 
Daniel 
 
On 27 May 2016 at 11:49, Fred Toussi <[hidden email]> wrote:
In recent versions of HSQLDB, Statement objects are not generally mutable and can be shared concurrently. The only mutable aspect is the setCursorPropertiesRequest() method that is checked for in StatementManager.compile(). Statements that are out-of-date are also rejected by the same method.
 
Fred
 
On Fri, May 27, 2016, at 11:26, Daniel Walsh wrote:
I initially added the Statement Cache to the Statement Manager class, but I realized that sharing a Statement object concurrently between many connections would mean mixing state between concurrent users causing functional errors. To do that correctly I would need to implement a global cache of statements that never get "used" , but rather on retrieval get cloned , so their state isn't shared, and then that Statement follows a regular lifecycle, being returned to the connection, pooled as normally done, closed and destroyed as usual. But the object in the cache would be retained, un-modified after creation , used as a template so-to-speak. 
 
And that whole solution as described above seemed to be less performant than just caching the non-modifiable aspect of the Statement objects and let them be constructed natively as is currently done. 
 
Am I correct in thinking that sharing statement objects across connections would cause data inconsistencies? It appears that way. 
 
Daniel 
 
 
On 27 May 2016 at 11:07, Fred Toussi <[hidden email]> wrote:
If you limit statement caching to PreparedStatement the best place for the cache would be the StatementManager.
 
The StatementManager compile() method checks if the SQL and its Statement object already exists. The Statement is removed when it is closed. All Statements are cleared when the connection and the session are closed. So for your usage, you can put a non-static ConcurrentMap in org.hsqldb.Database and each time a statement is compiled, you add to the ConcurrentMap and each time the lookup in compile() does not find the SQL and its statement, you try retrieving from the Concurrent map and let the compile() method perform its other checks. Note there is a different map for each schema (same SQL compiled with a different current schema can target different tables). Therefore you need  a map of schema name to ConcurrentMap to be precise. This method doesn't need changes to resetSession and works with or without a connection pool.
 
Fred
 
On Fri, May 27, 2016, at 09:49, Daniel Walsh wrote:
I'll investigate using a Connection Pool , i'm currently using the apache BasicDataSource for connection retrieval.
 
However if every time I retrieve from the pool if resetSession is called doesn't that clear the statement manager as you indicate, wiping out any performance gains of persisting compiled PreparedStatements between separate usages of the connection? Essentially I have many concurrent users hitting the same statement structures with different values for ? , ? , ? parameters. The statements are programatically generated, and quite finite, between 100-5000 thousand possible statements depending on the scale of the configuration. 
 
For this reason at the level of the class ParserDQL I added a very basic hack ( since this class is called in a stateful way , where fields are set on it - so i was a bit careful ) , I added a private static final ConcurrentMap of SQL String -> StatementQuery , on the basis that if the sql strings are identical their compiled queries must also be identical, I couldnt think of a scenario where they would need to be destroyed and recreated , if the DDL changes then the statement would also need to change by necessity, therefore the value in the cache would never get used again .... so except for the glaring memory leak at this moment it would work in principle. 
 
So when compileCursorSpecification is called on this class , from digging through the code I knew the sqlString was defined at this time, so i do a simple map lookup and return if found, or compile , push to map , return if not.
I identified the StatementQuery as the only costly part, so I cache that state-less element and cannot really see any drawbacks like you mentioned regarding closing sessions etc. 
 
Thanks for discussing this matter in great depth btw , I'm a large fan of HSQLDB and have used it in projects @ my company for several years.
 
Daniel 
 
 
 
On 26 May 2016 at 20:48, Fred Toussi <[hidden email]> wrote:
There was a global Prepared Statement cache in version 1.8. This was too complex as it had to keep track of closed sessions and statements closing on some sessions and not on others. The cache was moved to individual sessions in version 2.0.
 
You can use a connection pool. There are two options. The org.hsqldb.jdbc.JDBCPool or a generic JDBC pool.
 
The org.hsqldb.Session class has a resetSession() method that calls statementManager.reset(). The JDBCPool calls the resetSession() method. Therefore you need to modify either class to not reset the StatementManager. Third party pools may or may not be resetting the HSQLDB session upon reuse, so you need to check.
 
Fred Toussi
 
 
 
On Thu, May 26, 2016, at 19:41, Daniel Walsh wrote:
Hi Devs,
 
I have a question I couldn't find an answer for online or by investigating the HSQLDB code myself. 
 
Is there a reason currently in the implementation that the Stateless aspect of a Prepared Statement when compiled isn't stored in a global non-connection level cache ? 
 
I have a use case where connections are created and destroyed regularly and each connection calls 100+ prepared statement creation and executions during its lifecycle, in a test case that encompasses many other factors I increased performances by creating a naive low level cache of the compiled statement to be used each time a new PreparedStatement with a matching sql string was created.
 
This had a massive benefit to performance because the statements being compiled are quite complex.
 
Is this a feature that hasn't been implemented on purpose ? Or just an open gap ?
 
Thanks,
Daniel Walsh
Software Dev 
------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
 
------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
 
------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
 
------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
_______________________________________________
hsqldb-developers mailing list

------------------------------------------------------------------------------
What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic
patterns at an interface-level. Reveals which users, apps, and protocols are
consuming the most bandwidth. Provides multi-vendor support for NetFlow,
J-Flow, sFlow and other flows. Make informed decisions using capacity
planning reports. https://ad.doubleclick.net/ddm/clk/305295220;132659582;e
_______________________________________________
hsqldb-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/hsqldb-developers
Loading...