Ryan February 2016

Improve speed cross-mapping array of arrays

I'm looking for a little help converting a script from Perl to PHP. In Perl I used hashes to map values as keys across two arrays read in from two files. The files aren't very big: roughly 150,000 rows in one and 50,000 in the other. In Perl this runs in roughly 10 seconds, but in PHP, even after cutting the larger input file from 150,000 rows to around 20,000, it takes nearly 3 minutes. I'm wondering whether this is a limitation of the language or whether my design is inherently flawed.

The two existing arrays of arrays are $ao_hash and $string_hash, built as follows:

// Load file contents
$file_contents = str_replace("\t","|",file_get_contents($_FILES['file']['tmp_name']));
$file_array = explode("\n",$file_contents);

// Pass client dictionary into an array of arrays, keyed by NDC (column 10)
$ao_hash = array();
foreach ($file_array as $line) {
    $line_array = explode("|",$line);
    if (stripos($line_array[0], 'mnemonic') !== false) {
        continue;
    }

    if (!isset($line_array[1])) {
        continue;
    }

    if (stripos($line_array[1], 'n') !== false) {
        continue;
    }

    if (!isset($line_array[10])) {
        continue;
    }

    $ao_hash[$line_array[10]] = $line;
}

Both hashes are built using this method, and both work well (expected results, quick execution). The result looks like this:

$array1[NDC] = some|delimited|file|output
$array2[NDC] = another|file|with|delimited|output

I'm using NDC as the primary key to cross-map both arrays.
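
To make the data shape concrete, here is a minimal sketch of a single keyed lookup across the two arrays. The NDC values, the row contents, and the helper name find_reference_row are made up for illustration; only the substr($key, 0, 11) convention comes from the code in this post.

```php
<?php
// Hypothetical sample data: both arrays are keyed by NDC and hold
// the original pipe-delimited line as the value.
$ao_hash = array(
    '00002143380' => '00002143380|client|row|a',
    '00003089421' => '00003089421|client|row|b',
);
$string_hash = array(
    '00002143380' => '00002143380|reference|row',
);

// Hypothetical helper (not part of the original script): return the
// matching reference row for a client NDC, or null when there is none.
function find_reference_row(array $string_hash, $ndc)
{
    // Only the first 11 characters of the key are used for matching.
    $lookup_key = substr($ndc, 0, 11);

    // isset() on an array key is a direct hash lookup, not a scan.
    return isset($string_hash[$lookup_key]) ? $string_hash[$lookup_key] : null;
}

$hit  = find_reference_row($string_hash, '00002143380');
$miss = find_reference_row($string_hash, '00003089421');
```

Returning null for a miss gives the caller an explicit per-key result to branch on, rather than depending on a variable that may still hold a value from an earlier key.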

// Compare the client's drug report against the cut-down file
while (list ($key, $value) = each ($ao_hash)) {

    // Use the NDC to match across array of arrays
    if (isset($string_hash[substr($key,0,11)])) {
        $string_selector = $string_hash[substr($key,0,11)];
    }

    // Check if the client NDC entry exists in cut-down file
    if (!isset($string_selector)) {

        // No direct NDC match, reserve for an FSV look-up
        $ao_array = explode("|", $valu        

Answers


Ryan February 2016

Redditor https://www.reddit.com/user/the_alias_of_andrea solved the issue:

Instead of using:

while (list($key, $value) = each($ao_hash))

it would be more efficient to use

foreach ($ao_hash as $key => $value)

Now a 13 MB file is processed immediately, and I get the expected results.
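
For anyone making the same change, a minimal sketch of the rewritten loop follows. The sample rows and the $field_counts variable are made up for illustration; the loop header is the one from the answer. Note also that each() was deprecated in PHP 7.2 and removed in PHP 8.0, so the foreach form is the one that still runs on current versions.

```php
<?php
// Hypothetical stand-in for the real $ao_hash (NDC => delimited line).
$ao_hash = array(
    '00002143380' => 'a|b|c',
    '00003089421' => 'd|e',
);

$field_counts = array();

// Same key/value pairs as while (list($key, $value) = each($ao_hash)),
// without depending on the array's internal pointer.
foreach ($ao_hash as $key => $value) {
    $fields = explode('|', $value);
    $field_counts[$key] = count($fields);
}
```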

