[Mulgara-general] Mulgara size limits

Alex Hall alexhall at revelytix.com
Tue Jul 1 12:35:16 PDT 2008


Chuck Borromeo wrote:
> Hello,
>   OK I swapped the parser.  That didn't work.

Well, it got us closer :-)

> The parser doesn't run out of memory, but it appears that Mulgara runs out of memory.  I ran the Mulgara server with the following settings:
> 
> java -d64 -Xmx4096m -jar mulgara-2.0-alpha.jar --serverconfig file:mulgara-config.xml
> 
> Here is the stack trace:
[snip]

That's a bit disheartening, because I think the developers on this list 
will agree that the behavior you're seeing *shouldn't* be happening. 
The logical next step would be to get a heap profile.  You can do this 
by adding "-Xrunhprof:heap=sites" as a Java VM argument.  The resulting 
java.hprof.txt will probably be too big to send to the list in its 
entirety, but if you can cut and paste the table at the end of the file, 
or post the whole file somewhere for us to download, somebody can take a 
look and see if there's anything out of the ordinary.
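
As a rough sketch, assuming a Unix-like shell and the same command line quoted above, the profiling run and the extraction of the final table might look like this. The helper names start_profiled and extract_sites are made up for illustration; the SITES BEGIN / SITES END markers are what hprof's text output normally uses to delimit the allocation-sites table, so check them against your own file.

```shell
# Hypothetical helper: start the server with the hprof agent enabled.
# Same options as the original command, plus -Xrunhprof:heap=sites.
# By default hprof writes java.hprof.txt when the JVM exits.
start_profiled() {
  java -d64 -Xmx4096m -Xrunhprof:heap=sites \
    -jar mulgara-2.0-alpha.jar --serverconfig file:mulgara-config.xml
}

# Hypothetical helper: print only the allocation-sites table from an
# hprof text dump (defaults to java.hprof.txt in the current directory).
extract_sites() {
  sed -n '/^SITES BEGIN/,/^SITES END/p' "${1:-java.hprof.txt}"
}
```

After reproducing the failure and letting the JVM exit, something like "extract_sites > sites-table.txt" should leave a file small enough to paste into a reply.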

Regards,
Alex

> --- On Tue, 7/1/08, Paul Gearon <gearon at ieee.org> wrote:
> 
>> From: Paul Gearon <gearon at ieee.org>
>> Subject: Re: [Mulgara-general] Mulgara size limits
>> To: "Mulgara General" <mulgara-general at mulgara.org>
>> Date: Tuesday, July 1, 2008, 1:18 PM
>> On Tue, Jul 1, 2008 at 9:30 AM, Seaborne, Andy
>> <andy.seaborne at hp.com> wrote:
>>>> [snip]
>>>>
>>>> The out-of-memory exception is being thrown in the Jena ARP parser,
>>>> which we use for parsing RDF/XML.  Looking quickly at the Jena source, I
>>>> see that ARP keeps a bunch of stuff in memory, namely blank node
>>>> mappings, so it doesn't surprise me that it runs out of memory for large
>>>> files.
>>> If it's the check for illegal reuse of bNode ids, then maybe it's an
>>> old version of ARP? Nowadays it issues a warning about being unable to
>>> track illegal reuse of bNode ids across the whole file and stops that
>>> checking while continuing parsing.  If you're using an old version, then
>>> if it's using an old version of Xerces, that might also be a factor.
>>
>> I'm pretty sure that no one has updated this in a long time, so it's
>> pretty much guaranteed to be an old version of ARP. Xerces got updated
>> about 2 years ago, but I can't recall if it's happened again since
>> then.
>>
>> I didn't realize that the ARP code was keeping blank node mappings in
>> memory. Whoever implemented the content handler with ARP must not have
>> known this, as the content handler maintains its own mappings. These
>> mappings are file-based, so they shouldn't have memory restrictions,
>> though I can't comment on their speed, as I don't know how well they
>> cache in memory.
>>
>> Alex's comment about RIO reminds me.... wasn't this parser supposed to
>> replace the ARP-based one?
>>
>> Paul
>> _______________________________________________
>> Mulgara-general mailing list
>> Mulgara-general at mulgara.org
>> http://mulgara.org/mailman/listinfo/mulgara-general

