PT. Indosat Tbk Linkedin · Posted 22d ago

Data Engineering

Indonesia

Job Description

Company Overview:

Our organization is a leading innovator in cybersecurity, cloud, and AI solutions, dedicated to developing cutting-edge products and services that address the evolving needs of the technology landscape. We thrive in a rapidly developing market (Indonesia) where the demand for advanced tech solutions is ever-growing, driven by rapid technological advancements. We are an AI-native company committed to continuous improvement, helping our customers unlock their full revenue potential.

Role Summary

As a Data Engineer, you will build and manage the data pipelines used to train and fine-tune our Large Language Models (LLMs), with a specific focus on the Indonesian language. You will be responsible for designing, building, and maintaining a robust and scalable data infrastructure. You will collaborate closely with our team of Data Scientists and Machine Learning Engineers to ensure the availability of high-quality, clean, and structured Indonesian language data for developing accurate and locally relevant AI models.

Key Responsibilities

  • Build and Manage Data Pipelines: Design, develop, and maintain ETL (Extract, Transform, Load) processes to collect and process Indonesian text data from various sources, such as databases, APIs, and log files.
  • Data Collection and Integration: Gather complex and relevant datasets tailored to business needs, particularly Indonesian text data that covers a wide range of dialects and linguistic styles.
  • Data Cleaning and Pre-processing: Perform data cleaning to handle inconsistent, duplicate, or corrupted data. You will also transform raw data into a usable format for training machine learning models.
  • Data Architecture: Design and implement an efficient and scalable data architecture, including data warehouses and data lakes, to store and manage large volumes of data.
  • Ensure Data Quality: Develop data validation methods and analysis tools to ensure the integrity and accuracy of the data used for model training.
  • Team Collaboration: Work closely with Data Scientists to understand their data requirements and provide ready-to-use data for fine-tuning and evaluating LLMs.
  • Performance Optimization: Monitor and optimize the performance of data pipelines to ensure efficiency and scalability, especially when handling very large volumes of data.

Requirements

Qualifications & Experience:

Education

  • Required: Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related quantitative field.
  • Preferred: Master’s degree in Computer Science.

Experience

  • Required: 3 to 5 years of hands-on experience in a data engineering role, particularly on projects involving big data and machine learning, with a proven track record of designing and implementing data pipelines and architecture, including data ingestion, storage, processing, and delivery.
  • Preferred: Experience with Big Data technologies (e.g., Hadoop, Spark). Experience with cloud platforms (e.g., AWS, GCP, Azure) and their associated data services. Familiarity with DevOps/DataOps principles for CI/CD.

Required Skills

  • Technical Skills:
      • Programming Languages: High proficiency in programming languages such as Python, SQL, and Scala.
      • Databases: Deep understanding of relational databases (like MySQL, PostgreSQL) and NoSQL databases (like MongoDB).
      • Big Data Tools: Hands-on experience with big data technologies such as Apache Spark, Hadoop, and Kafka.
      • Cloud Computing: Knowledge of cloud platforms like Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure.
      • ETL Tools: Familiarity with ETL tools like Apache Airflow, Talend, or Stitch.
  • Soft Skills:
      • Strong analytical and problem-solving abilities.
      • Excellent communication and teamwork skills to collaborate effectively with various teams.
      • Ability to work independently in a dynamic environment.

Competencies

  • Technical:
      • Architecture Design
      • Business Needs Analysis
      • Data Analysis and Interpretation
      • Infrastructure Design
      • Software Design
      • Solution Architecture
      • System Architecture Design
      • System Configuration Management
      • System Integration
  • Leadership:
      • Applied Learning
      • Building Customer Loyalty
      • Business Awareness
      • Collaborating
      • Continuous Improvement
      • Planning & Organizing
      • Quality Orientation