Channel: Hortonworks » All Topics

Row Numbering in ORC Automatic incrementing while inserting row


Replies: 0

I would like to add a Row Number or Row Sequence ID column whose value increments automatically with each inserted row. The Hive UDF UDFRowSequence can be used, but it runs in a single reducer. Is there any other feature in the latest Hive 0.14 to increment a row sequence automatically when inserting into ORC?
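A hedged alternative, not from the original post: since Hive 0.11 the ROW_NUMBER() windowing function can generate the sequence at insert time instead of UDFRowSequence. Note that a global ordering still funnels the numbering step through one reducer, so this mainly simplifies the query; the table and column names below are hypothetical.

-- hypothetical source table src(col1 STRING) and an ORC target
CREATE TABLE tgt (row_id BIGINT, col1 STRING) STORED AS ORC;

INSERT INTO TABLE tgt
SELECT ROW_NUMBER() OVER (ORDER BY col1) AS row_id, col1
FROM src;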


distcp failing to trigger MR


Replies: 0

Please check this thread for more info.

When I execute a distcp command to copy from the local file system to HDFS, I get the following error:

15/06/01 13:53:37 INFO tools.DistCp: DistCp job-id: job_1431689151537_0003
15/06/01 13:53:37 INFO mapreduce.Job: Running job: job_1431689151537_0003
15/06/01 13:53:44 INFO mapreduce.Job: Job job_1431689151537_0003 running in uber mode : false
15/06/01 13:53:44 INFO mapreduce.Job: map 0% reduce 0%
15/06/01 13:53:47 INFO mapreduce.Job: Task Id : attempt_1431689151537_0003_m_000000_1000, Status : FAILED
java.io.FileNotFoundException: File /opt/dev/sdb/hadoop/yarn/local/filecache does not exist

15/06/01 13:53:51 INFO mapreduce.Job: Task Id : attempt_1431689151537_0003_m_000000_1001, Status : FAILED
java.io.FileNotFoundException: File /opt/dev/sdd/hadoop/yarn/local/filecache does not exist

15/06/01 13:53:55 INFO mapreduce.Job: Task Id : attempt_1431689151537_0003_m_000000_1002, Status : FAILED
java.io.FileNotFoundException: File /opt/dev/sdh/hadoop/yarn/local/filecache does not exist

15/06/01 13:54:02 INFO mapreduce.Job: map 100% reduce 0%
15/06/01 13:54:02 INFO mapreduce.Job: Job job_1431689151537_0003 completed successfully
15/06/01 13:54:02 INFO mapreduce.Job: Counters: 34
File System Counters
FILE: Number of bytes read=41194
FILE: Number of bytes written=118592
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=372
HDFS: Number of bytes written=41194
HDFS: Number of read operations=15
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Job Counters
Failed map tasks=3
Launched map tasks=4
Other local map tasks=4
Total time spent by all maps in occupied slots (ms)=10090
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=10090
Total vcore-seconds taken by all map tasks=10090
Total megabyte-seconds taken by all map tasks=10332160
Map-Reduce Framework
Map input records=1
Map output records=0
Input split bytes=114
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=31
CPU time spent (ms)=760
Physical memory (bytes) snapshot=223268864
Virtual memory (bytes) snapshot=2503770112
Total committed heap usage (bytes)=1171259392
File Input Format Counters
Bytes Read=258
File Output Format Counters
Bytes Written=0
org.apache.hadoop.tools.mapred.CopyMapper$Counter
BYTESCOPIED=41194
BYTESEXPECTED=41194
COPY=1
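Not part of the original post, but a hedged troubleshooting sketch: the FileNotFoundException points at NodeManager local directories (…/yarn/local/filecache) that are missing on some disks, so one thing to check is yarn.nodemanager.local-dirs and whether those directories exist and are owned by the YARN user. The /opt/dev/sd* paths are taken from the log; the yarn:hadoop ownership is an assumption.

# On the affected NodeManager host(s):
grep -A1 'yarn.nodemanager.local-dirs' /etc/hadoop/conf/yarn-site.xml
ls -ld /opt/dev/sd*/hadoop/yarn/local/filecache
# If a filecache dir is missing, recreating it (or restarting the NodeManager so it
# recreates its local dirs) is one possible fix -- ownership assumed to be yarn:hadoop:
sudo mkdir -p /opt/dev/sdb/hadoop/yarn/local/filecache
sudo chown yarn:hadoop /opt/dev/sdb/hadoop/yarn/local/filecache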

HIVE Table Updates/Deletes


Replies: 0

Has anyone worked on updating/deleting Hive tables? For some reason it is not working for me. I created the table with TBLPROPERTIES ('transactional'='true') and ensured that hive.support.concurrency is set to true, hive.enforce.bucketing=true, and hive.exec.dynamic.partition.mode=nonstrict.

I’d appreciate it if someone could help or publish a tutorial with detailed steps.
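For reference (not from the original post), a minimal sketch of the session settings and DDL that Hive 0.14 ACID typically requires; the table, columns, and values are hypothetical:

SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-- compaction settings (hive.compactor.initiator.on, hive.compactor.worker.threads)
-- normally need to be enabled on the metastore/HiveServer2 side as well

CREATE TABLE acid_demo (id INT, name STRING)
CLUSTERED BY (id) INTO 2 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

INSERT INTO TABLE acid_demo VALUES (1, 'a'), (2, 'b');
UPDATE acid_demo SET name = 'z' WHERE id = 1;
DELETE FROM acid_demo WHERE id = 2;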

Port in use when starting Name Node


Replies: 1

Hi all,

When I start the NameNode from the Ambari interface it fails; the generated log is below:

STARTUP_MSG: build = git@github.com:hortonworks/hadoop.git -r 22a563ebe448969d07902aed869ac13c652b2872; compiled by ‘jenkins’ on 2015-03-31T19:49Z
STARTUP_MSG: java = 1.7.0_67
************************************************************/
2015-05-28 20:41:19,307 INFO namenode.NameNode (SignalLogger.java:register(91)) – registered UNIX signal handlers for [TERM, HUP, INT]
2015-05-28 20:41:19,310 INFO namenode.NameNode (NameNode.java:createNameNode(1367)) – createNameNode []
2015-05-28 20:41:19,693 INFO impl.MetricsConfig (MetricsConfig.java:loadFirst(111)) – loaded properties from hadoop-metrics2.properties
2015-05-28 20:41:19,894 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(62)) – Initializing Timeline metrics sink.
2015-05-28 20:41:19,895 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(80)) – Identified hostname = NAMENODE, serviceName = namenode
2015-05-28 20:41:19,975 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(92)) – Collector Uri: http://NAMENODE:6188/ws/v1/timeline/metrics
2015-05-28 20:41:19,987 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:start(195)) – Sink timeline started
2015-05-28 20:41:20,065 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(376)) – Scheduled snapshot period at 10 second(s).
2015-05-28 20:41:20,065 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:start(191)) – NameNode metrics system started
2015-05-28 20:41:20,066 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(349)) – fs.defaultFS is hdfs://NAMENODE:8020
2015-05-28 20:41:20,067 INFO namenode.NameNode (NameNode.java:setClientNamenodeAddress(369)) – Clients are to use NAMENODE:8020 to access this namenode/service.
2015-05-28 20:41:20,244 INFO hdfs.DFSUtil (DFSUtil.java:httpServerTemplateForNNAndJN(1694)) – Starting Web-server for hdfs at: http://NAMENODE:50070
2015-05-28 20:41:20,289 INFO mortbay.log (Slf4jLog.java:info(67)) – Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2015-05-28 20:41:20,298 INFO http.HttpRequestLog (HttpRequestLog.java:getRequestLog(80)) – Http request log for http.requests.namenode is not defined
2015-05-28 20:41:20,309 INFO http.HttpServer2 (HttpServer2.java:addGlobalFilter(699)) – Added global filter ‘safety’ (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2015-05-28 20:41:20,314 INFO http.HttpServer2 (HttpServer2.java:addFilter(677)) – Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs
2015-05-28 20:41:20,314 INFO http.HttpServer2 (HttpServer2.java:addFilter(684)) – Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2015-05-28 20:41:20,314 INFO http.HttpServer2 (HttpServer2.java:addFilter(684)) – Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2015-05-28 20:41:20,343 INFO http.HttpServer2 (NameNodeHttpServer.java:initWebHdfs(86)) – Added filter ‘org.apache.hadoop.hdfs.web.AuthFilter’ (class=org.apache.hadoop.hdfs.web.AuthFilter)
2015-05-28 20:41:20,345 INFO http.HttpServer2 (HttpServer2.java:addJerseyResourcePackage(603)) – addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2015-05-28 20:41:20,363 INFO http.HttpServer2 (HttpServer2.java:start(830)) – HttpServer.start() threw a non Bind IOException
java.net.BindException: Port in use: NAMENODE:50070
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:891)
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:827)
at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:142)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:703)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:590)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:762)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:746)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1438)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
Caused by: java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:886)
… 8 more
2015-05-28 20:41:20,366 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) – Stopping NameNode metrics system…
2015-05-28 20:41:20,366 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(135)) – timeline thread interrupted.
2015-05-28 20:41:20,366 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(216)) – NameNode metrics system stopped.
2015-05-28 20:41:20,367 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(605)) – NameNode metrics system shutdown complete.
2015-05-28 20:41:20,367 FATAL namenode.NameNode (NameNode.java:main(1509)) – Failed to start namenode.
java.net.BindException: Port in use: NAMENODE:50070
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:891)
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:827)
at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:142)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:703)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:590)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:762)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:746)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1438)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
Caused by: java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:444)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:886)
… 8 more
2015-05-28 20:41:20,369 INFO util.ExitUtil (ExitUtil.java:terminate(124)) – Exiting with status 1
2015-05-28 20:41:20,370 INFO namenode.NameNode (StringUtils.java:run(659)) – SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at NAMENODE/IPNAMENODE
************************************************************/

Can someone help me with this?

Thanks in advance,
Ricardo Marques.
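Not an answer from the thread, just a hedged diagnostic sketch: the underlying "Cannot assign requested address" usually means the process could not bind to the IP that the configured hostname resolves to (rather than a true port conflict), so it is worth checking name resolution alongside the port. The hostname is redacted as NAMENODE in the log above, so adjust the grep accordingly.

# Is anything already listening on 50070?
netstat -tlnp | grep 50070
# Does the NameNode hostname resolve to an address that is local to this machine?
hostname -f
grep -i namenode /etc/hosts
ip addr show | grep inet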

Permission Denied for Hive Query


Replies: 0

Using a newly installed sandbox and Hue with user id hue, attempting to select from the supplied sample table:

select * from sample_07 limit 10
..
fails with:

Error occurred executing hive query: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [hue] does not have [USE] privilege on [default]

The HDFS file is owned by user hue. How does one grant the USE privilege on the default database?
The same error occurs using beeline. What is needed to use these tools to query the sample tables?
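A hedged sketch of one way to address this (not from the thread): on the sandbox the HiveAccessControlException comes from the authorization layer, so either add a policy for user hue on the default database in the Ranger admin UI, or, if SQL-standard authorization is in effect, grant the privilege from an admin beeline session. The statements below assume the connecting user is in the admin role:

-- from an admin beeline session (SQL-standard authorization assumed)
SET ROLE admin;
USE default;
GRANT SELECT ON sample_07 TO USER hue;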

How to tell what Ambari version is running?


Replies: 3

I have a number of clusters, so forgive me if I don’t “just remember”.
Everything’s running fine.
The web GUI tells me it’s HDP 2.2 and gives me the versions of the HDP 2.2 stack,

but Ambari isn’t included in the “stack” report at http://…/#/main/admin/repositories

I was just checking the Hortonworks web page and noticed Ambari 2.0 is what it should be. Since Ambari and the HDP stack can move independently, I was thinking I should check that, when I installed HDP 2.2, I did everything for Ambari 2.0.

The reason I’m wondering is that I see Nagios and Ganglia in my web GUI, and I just noticed that Ambari 2.0 should have those taken out.
So before I start on that, I was wondering: do I have Ambari 2.0?

How do I tell?
I clicked around for a couple of minutes and couldn’t find anything.

thanks,
-kevin
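A couple of hedged ways to check, not from the thread (the REST call assumes the default port and admin/admin credentials):

# On the Ambari server host:
ambari-server --version
rpm -q ambari-server          # or: dpkg -l ambari-server on Ubuntu
# Via the REST API (assumption: port 8080, admin/admin):
curl -u admin:admin http://localhost:8080/api/v1/services/AMBARI/components/AMBARI_SERVER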

Set up of MapReduce JobHistory Server fails.


Replies: 1

Hello,

I have configured a three node HA cluster on CentOS 7 (hadoop-1,hadoop-2,hadoop-3) with the following services:
hadoop-1 –> JournalNode, NameNode, DataNode, DFSZKFailoverController, ResourceManager, NodeManager
hadoop-2 –> JournalNode, NameNode, DataNode, DFSZKFailoverController
hadoop-3 –> DataNode, zookeeper (as third node for the quorum)

This is my first Hadoop installation and the above services seem to be running as expected, but then I realized that I had skipped the JobHistory Server creation/start steps.

I went back and when I tried to create the directories for the JobHistory server, following the steps mentioned here

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap4-4.html

I have the following error:

$ hadoop fs -mkdir -p /mr-history/tmp
mkdir: Couldn’t create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

My understanding of Hadoop is not yet deep enough for me to work out the cause of the problem on my own. I only know that hdfs-site.xml has the following entry that could be related to that error:

<property>
  <name>dfs.client.failover.proxy.provider.qbs-mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

but that entry seems to be right to me (unless I’m wrong here).

I don’t know if this information is enough for someone in this forum to be able to help and guide me in the right direction to solve this problem.

Regards,

SysAdmin1
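Not a confirmed diagnosis, but one common mismatch worth checking (a hedged sketch): the suffix of dfs.client.failover.proxy.provider.<nameservice> has to match dfs.nameservices and the authority in fs.defaultFS exactly, e.g.:

<!-- hdfs-site.xml : the nameservice id must match the property suffix -->
<property>
  <name>dfs.nameservices</name>
  <value>qbs-mycluster</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.qbs-mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- core-site.xml -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://qbs-mycluster</value>
</property>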

Zookeeper connection refused


Replies: 2

Hello,

I installed an HDP 1.3 cluster using Ambari. When I create an HBase table in the Hive shell, the log shows the following error:

2013-10-03 12:00:39,644 WARN zookeeper.ClientCnxn (ClientCnxn.java:run(1089)) – Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2013-10-03 12:00:39,760 WARN zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:retryOrThrow(219)) – Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase-unsecure/hbaseid
2013-10-03 12:00:40,759 WARN zookeeper.ClientCnxn (ClientCnxn.java:run(1089)) – Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2013-10-03 12:00:41,862 WARN zookeeper.ClientCnxn (ClientCnxn.java:run(1089)) – Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

I did specify the parent znode as “hbase-unsecure” in hive-core.xml. What is the reason? Could it be that:

1. the HBase/ZooKeeper versions in the Hive lib directory are not the same as the ones in the cluster?
2. Hive attempted to connect to ZooKeeper on localhost? If so, how can I specify the ZooKeeper host?

Thanks!
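A hedged sketch addressing question 2 (not from the thread): the Hive HBase handler reads hbase.zookeeper.quorum and zookeeper.znode.parent, and when they are unset it falls back to localhost, so passing them explicitly is one way to test. The host names below are placeholders.

hive -hiveconf hbase.zookeeper.quorum=zk1.example.com,zk2.example.com,zk3.example.com \
     -hiveconf zookeeper.znode.parent=/hbase-unsecure

The same two properties can instead be set permanently in hive-site.xml.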


Hive ORC


Replies: 4

I tried creating a Hive (0.14) table using STORED AS ORC. When inserting from the query editor it failed, so I tried creating the table with buckets and table properties (transactional=true), and inserting data from the query editor worked. However, an Oozie workflow doing the same insert failed.

The questions I have are:
1. Does Hive 0.14 mandate specifying bucketing and table properties as I mentioned above?
2. How can I insert records successfully via an Oozie workflow?

Please advise.
Thanks.
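For question 1, a hedged note (not from the thread): bucketing and transactional=true are typically only needed for ACID operations (UPDATE/DELETE/transactional writes); a plain ORC table can be loaded with an ordinary INSERT … SELECT. A minimal sketch with hypothetical names:

-- plain (non-transactional) ORC table; no bucketing or TBLPROPERTIES required
CREATE TABLE orc_plain (id INT, name STRING) STORED AS ORC;
INSERT INTO TABLE orc_plain SELECT id, name FROM some_source;

For question 2, one thing to verify (an assumption, not a confirmed fix) is that the Oozie Hive action uses the same hive-site.xml / SET statements as the editor session, since transactional settings made only in the editor would not carry over to the workflow.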

HBase dashboard


Replies: 1

Is there an HBase dashboard available in Hue or elsewhere?

HDFS Transparent Data Encryption


Replies: 2

I have set up HDFS Transparent Data Encryption by referring to http://hortonworks.com/kb/hdfs-transparent-data-encryption/ and I can see encrypted data in the raw namespace. My question is: how should I test my setup? Can you give me some test cases? Also, is in-transit encryption supported in HDP 2.2? If yes, can you suggest test cases for that too? Thanks in advance.
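A hedged test sketch (not from the KB article verbatim): create a key and an encryption zone, write a file into it, then compare what an authorized client sees with the ciphertext in the raw namespace. Key names and paths below are placeholders; creating the zone and reading /.reserved/raw typically require the HDFS superuser.

hadoop key create testkey
hdfs dfs -mkdir /secure
hdfs crypto -createZone -keyName testkey -path /secure   # directory must be empty
hdfs crypto -listZones
echo "hello" | hdfs dfs -put - /secure/hello.txt
hdfs dfs -cat /secure/hello.txt                    # decrypted transparently for authorized users
hdfs dfs -cat /.reserved/raw/secure/hello.txt      # raw namespace: should show ciphertext

On the in-transit question: wire encryption is a separate feature from TDE (for example dfs.encrypt.data.transfer for the data transfer protocol and hadoop.rpc.protection=privacy for RPC) and is configured independently.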

MALLOC_ARENA_MAX, where to set


Replies: 1

I have a YARN application (Camus, for reading data from Kafka and writing to HDFS) which is running out of virtual memory. It is using 4 GB of real and 11 GB of virtual memory. I am worried that, because I am on RHEL 6, I might be hitting the glibc arena bug/feature where the few mallocs that are done use up far more virtual memory than they should.

The documentation in several places suggests setting the environment variable MALLOC_ARENA_MAX (to 4 or even 1), but I am not sure where.

I see that yarn.nodemanager.admin-env is set to MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
So I went to the yarn-env template in Ambari and added

export MALLOC_ARENA_MAX=4

(I also restarted all the node managers in a rolling restart).

No change. Is this right?
Perhaps I should just override yarn.nodemanager.admin-env setting it to a hard coded value directly.

Any ideas?
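A hedged sketch, not a confirmed answer: the yarn-env export affects the NodeManager daemon itself, while the containers of a specific job pick up their environment from the job’s own env properties, so if the memory is being used inside the Camus mappers it may be the per-job settings that matter. The property names below are the standard MR2 ones; the value 4 is just the example from the post.

# yarn-env template (NodeManager daemon environment):
export MALLOC_ARENA_MAX=4

# Per-job container environment, passed on the job command line:
-Dmapreduce.map.env=MALLOC_ARENA_MAX=4 \
-Dmapreduce.reduce.env=MALLOC_ARENA_MAX=4 \
-Dyarn.app.mapreduce.am.env=MALLOC_ARENA_MAX=4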

Pig Jobs fail silently in HUE


Replies: 1

Attempting to run any Pig job in Hue results in the job being submitted, but the progress stays perpetually at 0%.

There are no error messages, and no log entries that indicate what has happened.

I’m using Ambari-managed Hortonworks 2.2.2 and Hue 3.8.1 on RHEL 6.5.
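Not from the thread, but a hedged place to look: Hue submits Pig scripts through Oozie, so the Oozie workflow and the ResourceManager often hold the real status even when the Hue UI stays at 0%. Host names, port, and log path below are assumptions based on HDP defaults.

# Oozie workflow status (default Oozie port 11000 assumed):
oozie jobs -oozie http://<oozie-host>:11000/oozie -len 10
# Are the launcher/Pig applications actually getting accepted and run?
yarn application -list -appStates ALL | head
# Silent failures often show up in the Oozie server log (default HDP location assumed):
tail -n 100 /var/log/oozie/oozie.log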

Cannot login to Ambari


Replies: 6

I am new to Hadoop and started learning it recently. I installed the Hortonworks Sandbox today. I was able to log in to Hue using 127.0.0.1:8000, but not to Ambari on 127.0.0.1:8080. It asks for a username and password, for which I provided admin/admin, but it keeps asking me to enter the username/password. Could you please help me?
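A hedged sketch of things to try from an SSH session into the sandbox VM (not from the thread). The password-reset helper is the one described for recent sandbox images, so treat it as an assumption if your image differs:

ambari-server status      # is the server actually up?
ambari-server restart
# On recent HDP sandbox images there is a helper to (re)set the admin password:
ambari-admin-password-reset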

Aprimo CRM on Hive.


Replies: 4

We are trying to work on a POC where we would like to do some campaign management using the Aprimo CRM tool on Hadoop. As of now, the tool uses Teradata as the backend. Storing data in Teradata is proving costly and we would like to use Hadoop as the data source.
We are running HDP 2.1.
1. We are able to connect the Aprimo Marketing Studio application to HiveServer2.
2. We are able to view all the tables created in the default database.
3. When we try to query a table, the following error occurs: “Error: Object reference not set to an instance of an object.”

Request help on this.


Hue runsupervisor


Replies: 0

Hi
Could anyone tell me what the issue is here?
/usr/lib/hue/build/env/bin/hue runcpserver
starting server with options {'ssl_certificate': None, 'workdir': None, 'server_name': 'localhost', 'host': '0.0.0.0', 'daemonize': False, 'threads': 10, 'pidfile': None, 'ssl_private_key': None, 'server_group': 'hadoop', 'port': 8000, 'server_user': 'hue'}
Traceback (most recent call last):
File “/usr/lib/hue/build/env/bin/hue”, line 9, in <module>
load_entry_point(‘desktop==2.5.1′, ‘console_scripts’, ‘hue’)()
File “/usr/lib/hue/desktop/core/src/desktop/manage_entry.py”, line 60, in entry
execute_manager(settings)
File “/usr/lib/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/management/__init__.py”, line 438, in execute_manager
utility.execute()
File “/usr/lib/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/management/__init__.py”, line 379, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File “/usr/lib/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/management/base.py”, line 191, in run_from_argv
self.execute(*args, **options.__dict__)
File “/usr/lib/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/management/base.py”, line 220, in execute
output = self.handle(*args, **options)
File “/usr/lib/hue/desktop/core/src/desktop/management/commands/runcherrypyserver.py”, line 63, in handle
runcpserver(args)
File “/usr/lib/hue/desktop/core/src/desktop/management/commands/runcherrypyserver.py”, line 109, in runcpserver
start_server(options)
File “/usr/lib/hue/desktop/core/src/desktop/management/commands/runcherrypyserver.py”, line 85, in start_server
server.bind_server()
File “/usr/lib/hue/desktop/core/src/desktop/lib/wsgiserver.py”, line 1629, in bind_server
raise socket.error, msg
socket.error: [Errno 98] Address already in use
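A hedged diagnostic sketch (not from the thread): Errno 98 means something is already bound to port 8000, often an already-running Hue instance, so identifying the holder of the port is the first step; moving Hue to another port in hue.ini is an alternative if the conflict is legitimate. Service name and config path are assumptions based on HDP packaging.

# What is holding port 8000?
netstat -tlnp | grep ':8000'      # or: lsof -i :8000
ps -ef | grep -i '[h]ue'
# If it is a stale Hue instance, stop it first (service name assumed):
service hue stop
# Or move Hue to another port in /etc/hue/conf/hue.ini:
#   [desktop]
#   http_port=8888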

What Replication Factor does MR use?


Replies: 1

I have a weird problem with a number of different MapReduce jobs. When they are submitted, the job.jar and job.split files are created with a target replication setting of 10.
I cannot see this value anywhere in the config.
It is a pain because I don’t have 10 DataNodes in my cluster, so I temporarily have under-replicated blocks. This would be fine if they disappeared, but sometimes when the system fails they stay around.
I have tried setting dfs.replication.max to 5 (the number of DataNodes I have) and the MapReduce jobs all refuse to start! (It was previously set to 50.)

hdfs fsck / | grep Under
Connecting to namenode via http://mynamenode.local:50070
/user/example/.staging/job_1430325670718_7300/job.jar: Under replicated BP-1255772799-10.34.37.1-1421676659908:blk_1077575400_3836552. Target Replicas is 10 but found 5 replica(s).
/user/example/.staging/job_1430325670718_7300/job.split: Under replicated BP-1255772799-10.34.37.1-1421676659908:blk_1077575404_3836556. Target Replicas is 10 but found 5 replica(s).
/user/example/.staging/job_1430325670718_7302/job.jar: Under replicated BP-1255772799-10.34.37.1-1421676659908:blk_1077575414_3836566. Target Replicas is 10 but found 5 replica(s).
/user/example/.staging/job_1430325670718_7302/job.split: Under replicated BP-1255772799-10.34.37.1-1421676659908:blk_1077575417_3836569. Target Replicas is 10 but found 5 replica(s).
Under-replicated blocks: 4 (4.899343E-4 %)

And a few minutes later

hdfs fsck / | grep Under
Connecting to namenode via http://bruathdp001.iggroup.local:50070
Under-replicated blocks: 0 (0.0 %)
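Not from the thread, but for reference: the 10 comes from the job-submission replication setting rather than dfs.replication; in MR2 the property is mapreduce.client.submit.file.replication (default 10). A hedged sketch of lowering it cluster-wide, or per job with -Dmapreduce.client.submit.file.replication=5:

<!-- mapred-site.xml on the client/gateway nodes -->
<property>
  <name>mapreduce.client.submit.file.replication</name>
  <value>5</value>
</property>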

Best Match query o/p


Replies: 0

Hi All,

I need an urgent solution.

I have a Hive query with 1 AND clause and 10 OR clauses. When I execute the query I get 5 records as output. My requirement is to fetch the best-matched record: the record with the maximum number of OR matches should be returned, and the other records ignored.

Kindly respond on priority.
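A hedged sketch of one way to do this in HiveQL (the table name and cond1/cond2/cond3 boolean expressions are placeholders): score each row by how many OR conditions it satisfies, then keep the highest-scoring row. If several rows tie for the top score, LIMIT 1 picks one of them arbitrarily.

SELECT *
FROM (
  SELECT t.*,
         (CASE WHEN cond1 THEN 1 ELSE 0 END
        + CASE WHEN cond2 THEN 1 ELSE 0 END
        + CASE WHEN cond3 THEN 1 ELSE 0 END) AS match_score
  FROM my_table t
  WHERE and_cond
    AND (cond1 OR cond2 OR cond3)
) scored
ORDER BY match_score DESC
LIMIT 1;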

Problem with Ubuntu repositories and associated keys


Replies: 1

I am trying to install a small HDP cluster with Ambari 2.0.0 on a small OpenStack cloud.

I have two problems:

1) when configuring the Ambari server node I cannot install the repo keys.

When I run the following command:

sudo apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD # As per the documentation

I get the following results:

Executing: gpg --ignore-time-conflict --no-options --no-default-keyring --secret-keyring /tmp/tmp.Ai0poj4sKp --trustdb-name /etc/apt/trustdb.gpg --keyring /etc/apt/trusted.gpg --primary-keyring /etc/apt/trusted.gpg --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
gpg: requesting key 07513CAD from hkp server keyserver.ubuntu.com
gpgkeys: key B9733A7A07513CAD not found on keyserver
gpg: no valid OpenPGP data found.
gpg: Total number processed: 0

I get the same results when I use the MIT keyserver.

2) In the Ambari installer on the Select Stack page the repositories listed for Ubuntu will not validate.

They are:

http://public-repo-1.hortonworks.com/HDP/ubuntu12/2.x/GA/2.2.0.0
http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/ubuntu12

I cannot continue unless I check the “Skip Repository Base URL validation (Advanced)” option.
If I select this option the installation fails because the repositories listed do not exist and the key (see above) does not exist.

During the register phase the following error occurs:

W: GPG error: http://public-repo-1.hortonworks.com Ambari Release: The following signatures couldn’t be verified because the public key is not available: NO_PUBKEY B9733A7A07513CAD
W: GPG error: http://public-repo-1.hortonworks.com HDP-UTILS Release: The following signatures couldn’t be verified because the public key is not available: NO_PUBKEY B9733A7A07513CAD
W: Failed to fetch http://public-repo-1.hortonworks.com/HDP/ubuntu12/2.x/updates/2.1.7.0/dists/HDP/main/binary-amd64/Packages 404 Not Found

I do not know what to do from here.
It seems to me that the suggested repos should work. Please advise
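Not from the thread, but a hedged workaround sketch for the key problem: key lookups sometimes fail because the default HKP port (11371) is blocked, so forcing the keyserver over port 80 is worth a try, and the repo base URLs can be checked directly from the node (the URLs below are the ones listed on the Select Stack page above).

# Retry the key fetch over port 80 (assumption: outbound 11371 is blocked):
sudo apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 B9733A7A07513CAD
# Confirm the base URLs are reachable from the node itself:
curl -I http://public-repo-1.hortonworks.com/HDP/ubuntu12/2.x/GA/2.2.0.0
curl -I http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/ubuntu12

Note also that the 404 in the register log points at an HDP 2.1.7.0 “updates” path, which does not match the 2.2.0.0 GA base URL entered on the Select Stack page; a stale source list on the node is one possible explanation.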

hue in HDP 2.1
