How to get data from a Postgres database into a Hadoop sequence file?
I need to get data from a Postgres database into an Accumulo database. We're
hoping to use sequence files to run a map/reduce job to do this, but
aren't sure how to start. For internal technical reasons, we need to avoid
Sqoop.
Will this be possible without Sqoop? Again, I'm really not sure where to
start. Do I write a Java class that reads all the records (millions) over
JDBC and somehow outputs them to an HDFS sequence file?
Thanks for any input!