
Dockerfile build distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1

I'm using Docker to test and develop an ETL data pipeline with Airflow and AWS Glue. I'm currently using this blog post as a guide to launch the containers: https://towardsdatascience.com/develop-glue-jobs-locally-using-docker-containers-bffc9d95bd1 (Dockerfile GitHub link: https://github.com/jnshubham/aws-glue-local-etl-docker/blob/master/Dockerfile). When I run docker build -t glue:latest . I get the error below. The error is caused by the RUN pip install 'apache-airflow[postgres]'==1.10.10 --constraint https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.7.txt step in the Dockerfile. I've googled solutions for the first error and tried adding RUN yum install -y python3-devel to the Dockerfile, but I still got the same error. I've also read that it may have to do with the gcc version; on my machine it's currently:

    Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
    Apple clang version 11.0.3 (clang-1103.0.32.62)
    Target: x86_64-apple-darwin19.4.0
    Thread model: posix
    InstalledDir: /Library/Developer/CommandLineTools/usr/bin
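(I realize that clang output is from my Mac host, not from the container; since the build runs inside the centos image, I assume what actually matters is whether the base image ships a compiler at all, which I'd check with something like the sketch below.)

    # quick check (sketch): does the centos base image used in the Dockerfile have gcc?
    docker run --rm centos /bin/bash -c "command -v gcc || echo 'gcc not found'"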

Error output from docker build -t glue:latest .:

    Running setup.py install for psutil: started
    Running setup.py install for psutil: finished with status 'error'
    ERROR: Command errored out with exit status 1:
     command: /usr/bin/python3.6 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-ndmkn_ag/psutil/setup.py'"'"'; __file__='"'"'/tmp/pip-install-ndmkn_ag/psutil/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-nduz8awp/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.6m/psutil
    gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_SIZEOF_PID_T=4 -DPSUTIL_VERSION=570 -DPSUTIL_LINUX=1 -DPSUTIL_ETHTOOL_MISSING_TYPES=1 -I/usr/include/python3.6m -c psutil/_psutil_common.c -o build/temp.linux-x86_64-3.6/psutil/_psutil_common.o
    unable to execute 'gcc': No such file or directory
    Traceback (most recent call last):
      File "/usr/lib64/python3.6/distutils/unixccompiler.py", line 127, in _compile
        extra_postargs)
      File "/usr/lib64/python3.6/distutils/ccompiler.py", line 909, in spawn
        spawn(cmd, dry_run=self.dry_run)
      File "/usr/lib64/python3.6/distutils/spawn.py", line 36, in spawn
        _spawn_posix(cmd, search_path, dry_run=dry_run)
      File "/usr/lib64/python3.6/distutils/spawn.py", line 159, in _spawn_posix
        % (cmd, exit_status))
    distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1
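The line that stands out to me is unable to execute 'gcc': No such file or directory, so my current guess is that the image has no C compiler when pip tries to build psutil's C extension. A layer like the following, placed before the pip install step, is what I'd try next (just a sketch, I haven't confirmed it fixes the build):

    # sketch: install a C compiler alongside the Python headers before pip builds psutil
    RUN yum install -y gcc python3-devel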

My Dockerfile consists of:

FROM centos as glue

# initialize package env variables
ENV MAVEN=https://aws-glue-etl-artifacts.s3.amazonaws.com/glue-common/apache-maven-3.6.0-bin.tar.gz
ENV SPARK=https://aws-glue-etl-artifacts.s3.amazonaws.com/glue-1.0/spark-2.4.3-bin-hadoop2.8.tgz
ENV GLUE=https://github.com/awslabs/aws-glue-libs.git

# install required packages needed for aws glue
RUN yum install -y python3 java-1.8.0-openjdk java-1.8.0-openjdk-devel tar git wget zip
RUN yum install -y python3-devel
RUN ln -s /usr/bin/python3 /usr/bin/python
RUN ln -s /usr/bin/pip3 /usr/bin/pip

RUN mkdir /usr/local/glue
WORKDIR /usr/local/glue
RUN git clone -b glue-1.0 $GLUE
RUN wget $SPARK
RUN wget $MAVEN
RUN tar zxfv apache-maven-3.6.0-bin.tar.gz
RUN tar zxfv spark-2.4.3-bin-hadoop2.8.tgz
RUN rm spark-2.4.3-bin-hadoop2.8.tgz
RUN rm apache-maven-3.6.0-bin.tar.gz
RUN mv $(rpm -q -l java-1.8.0-openjdk-devel | grep "/bin$" | rev | cut -d"/" -f2- |rev) /usr/lib/jvm/jdk

ENV SPARK_HOME /usr/local/glue/spark-2.4.3-bin-spark-2.4.3-bin-hadoop2.8
ENV MAVEN_HOME /usr/local/glue/apache-maven-3.6.0
ENV JAVA_HOME /usr/lib/jvm/jdk
ENV GLUE_HOME /usr/local/glue/aws-glue-libs
ENV PATH $PATH:$MAVEN_HOME/bin:$SPARK_HOME/bin:$JAVA_HOME/bin:$GLUE_HOME/bin

RUN sh aws-glue-libs/bin/glue-setup.sh

# compile dependencies with maven build
RUN sed -i '/mvn -f/a rm /usr/local/glue/aws-glue-libs/jarsv1/netty-*' /usr/local/glue/aws-glue-libs/bin/glue-setup.sh
RUN sed -i '/mvn -f/a rm /usr/local/glue/aws-glue-libs/jarsv1/javax.servlet-3.*' /usr/local/glue/aws-glue-libs/bin/glue-setup.sh

# clean tmp dirs
RUN yum clean all
RUN rm -rf /var/cache/yum

ENV AIRFLOW_HOME /usr/local/airflow
WORKDIR /usr/local/src
COPY requirements.txt ./
RUN pip install --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt && \
    pip install 'apache-airflow[postgres]'==1.10.10 \
    --constraint https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.7.txt

RUN mkdir glue_etl_scripts
COPY glue_etl_scripts/log_data.py glue_etl_scripts/log_data.py
RUN mkdir config
COPY config/aws.cfg /config/aws.cfg
COPY config/airflow.cfg $AIRFLOW_HOME/airflow.cfg
RUN mkdir scripts
COPY scripts/entrypoint.sh scripts/entrypoint.sh
COPY scripts/connections.sh scripts/connections.sh

ENTRYPOINT ["scripts/entrypoint.sh"]
CMD ["webserver"]
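Once the image does build, my plan is to sanity-check the toolchain inside it with something along these lines (overriding the entrypoint defined above; the command is just a sketch):

    docker run --rm --entrypoint /bin/bash glue:latest -c "gcc --version; python --version; pip --version"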
