Skip to content
This repository has been archived by the owner on May 10, 2024. It is now read-only.

PARQUET-451: Add RowGroupReader helper class and refactor parquet_reader.cc into DebugPrint #23

Closed
wants to merge 4 commits into from

Conversation

wesm
Copy link
Member

@wesm wesm commented Jan 23, 2016

This also addresses PARQUET-433 and PARQUET-453.

ColumnReader* reader = make_column_reader(&col.meta_data,
&this->parent_->metadata_.schema[i + 1], input.release());

column_readers_[i] = std::shared_ptr<ColumnReader>(reader);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a non-zero chance of memory leak if the program crashes before or during line 121. This is marginally better:
column_readers_[i].reset(make_column_reader(...));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make make_column_reader return a shared_ptr then combine these two lines.

@wesm
Copy link
Member Author

wesm commented Jan 25, 2016

@asandryh I addressed your reviews in #22 here. @nongli I packaged up the stack of CLs so you can git merge --ff-only after we are done reviewing

wesm added 2 commits January 26, 2016 17:18
switch-on-type statements. Add parquet::SchemaElement* member to Decoder<T>,
for FLBA metadata.
parquet_reader.cc into ParquetFileReader::DebugPrint
@wesm
Copy link
Member Author

wesm commented Jan 27, 2016

@nongli this has been rebased (the conflicts weren't too bad) and here's a green build

https://travis-ci.org/wesm/parquet-cpp/builds/105058183

I've been hassling Travis CI about the build problems and they are having infrastructure issues that has caused the build queue to stall. I think it's ridiculous, and the ASF is supposed to have 30 concurrent build slaves, so this really should not be happening.

pls let me know what other code comments you have and we can get this merged. Then @majetideepak can refactor #24 to be compatible with this revamped code structure

@asandryh
Copy link
Contributor

+1, LGTM. @nongli

@@ -269,4 +167,36 @@ bool ColumnReader::ReadNewPage() {
return true;
}

std::shared_ptr<ColumnReader> make_column_reader(const parquet::ColumnMetaData* metadata,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MakeColumnReader

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. What do you think about making this a static method like ColumnReader::Make ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, that would be better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok done.

@nongli
Copy link
Contributor

nongli commented Jan 27, 2016

This looks good.

Can we update reader-test to read all the data files in the repo and verify some results? (as another PR).

@wesm
Copy link
Member Author

wesm commented Jan 27, 2016

Opened https://issues.apache.org/jira/browse/PARQUET-475. Will plan to write some more smoke tests to verify the data in those files.

Within short order (next few weeks), we really need to be able to round-trip data to files so that unit tests can generate test data and verify it can be read and written successfully.

@wesm
Copy link
Member Author

wesm commented Jan 27, 2016

@nongli thank you, good to go

@majetideepak
Copy link

I closed my pull request to avoid complicated rebasing with the current re-factored code.
Once #23 get pushed, I will submit a new pull request for PARQUET-428

@asfgit asfgit closed this in 1f24e76 Jan 28, 2016
@nongli
Copy link
Contributor

nongli commented Jan 28, 2016

Merged. Thanks!

@wesm wesm deleted the PARQUET-451 branch January 28, 2016 05:13
@wesm
Copy link
Member Author

wesm commented Jan 28, 2016

thank you sir!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants