Beam_installation123456785678777777.pptx

SravanthiVaka1 3 views 5 slides Apr 25, 2024


Slide Content

Installation of Apache Beam and Implementation of the Count Aggregation Function Using the Python SDK - Cloud & AI Analytics

Objective: Install the Apache Beam Python SDK in a Google Cloud Platform environment. Create a pipeline with PCollections, then apply Count to get the number of elements in three ways:

Counting all elements in a PCollection
Counting elements for each key
Counting all unique elements

Counting all elements in a PCollection: Count.Globally() counts every element in a PCollection, duplicates included, and produces a single total.

Counting elements for each key: Count.PerKey() counts the elements for each unique key in a PCollection of key-value pairs.

In the Python SDK these transforms are available under beam.combiners.Count.

Counting all unique elements: Count.PerElement() counts how many times each unique element occurs in a PCollection, producing one (element, count) pair per distinct element.

Resources:
https://cloud.google.com/dataflow/docs/guides/installing-beam-sdk#python
https://beam.apache.org/documentation/
https://beam.apache.org/documentation/transforms/python/aggregation/count/

Error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.7/tokenize.py", line 447, in open
    buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-install-041su91e/orjson/setup.py'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-041su91e/orjson/

Solution:

A) System install
    sudo python3 -m pip install -U pip
    sudo python3 -m pip install -U setuptools

B) Virtual env / Pipenv
    # Within the venv
    pip3 install -U pip
    pip3 install -U setuptools

Finally, try installing again with:
    pip3 install apache-beam
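After upgrading, it can help to confirm which versions are actually visible to the interpreter that will run the pipeline. The following is a small sketch for that check. The helper names installed_version and report_versions are hypothetical and not part of pip or Beam, and the sketch relies only on each package exposing a __version__ attribute.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def report_versions(module_names=("pip", "setuptools", "apache_beam")):
    """Map each module name to its reported version (None if unavailable)."""
    return {name: installed_version(name) for name in module_names}


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running it with the same python3 used for the install (system or venv) shows whether the upgrade and the apache-beam install landed in that environment.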

Thank You..!