Python programming (zero to advance!!!)

 DATA MANIPULATION:

Pandas provides a wide range of functions for data manipulation, including indexing, slicing, merging, reshaping, and pivoting data.
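As a small illustration of the indexing and slicing mentioned above, here is a minimal sketch. The sample table and the variable names (`bob_age`, `bob_to_david`, `first_two`) are assumptions made up for this example, not part of the lesson's dataset.

```python
import pandas as pd

# Hypothetical sample data, indexed by employee name
df = pd.DataFrame(
    {"Age": [25, 30, 35, 40], "Salary": [50000, 60000, 70000, 80000]},
    index=["Alice", "Bob", "Charlie", "David"],
)

# Label-based indexing with .loc: a single cell, then a slice of rows.
# A label slice includes its end label ("David" is part of the result).
bob_age = df.loc["Bob", "Age"]
bob_to_david = df.loc["Bob":"David", ["Salary"]]

# Position-based indexing with .iloc: the first two rows.
# A position slice excludes its end position (row 2 is not included).
first_two = df.iloc[0:2]

print(bob_age)
print(bob_to_david)
print(first_two)
```

The main thing to notice is the difference in slice endpoints: `.loc` slices are inclusive of the end label, while `.iloc` slices follow the usual Python convention and stop before the end position.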

Data manipulation in Pandas involves performing various operations on DataFrame and Series objects to modify, filter, transform, and analyze data. Some common data manipulation tasks include:

a. Selecting Data: Extracting specific rows, columns, or subsets of data from a DataFrame based on certain criteria.

b. Filtering Data: Keeping only the rows or columns of a DataFrame that meet certain conditions.

c. Sorting Data: Sorting the rows of a DataFrame based on the values of one or more columns.

d. Grouping Data: Grouping rows of a DataFrame based on the values of one or more columns and performing aggregate operations on each group.

e. Aggregating Data: Computing summary statistics (e.g., mean, median, sum) for groups of data or for entire columns.

f. Applying Functions: Applying custom functions to elements, rows, or columns of a DataFrame.

g. Handling Missing Data: Identifying and handling missing or null values in a DataFrame, such as filling missing values or dropping rows or columns with missing data.

h. Merging and Joining Data: Combining multiple DataFrames into a single DataFrame based on common keys or indices.

i. Reshaping Data: Transforming the layout or structure of a DataFrame, such as pivoting, melting, or stacking.

j. Appending and Concatenating Data: Adding new rows or columns to a DataFrame, or combining multiple DataFrames along rows or columns.
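Grouping, aggregating, and handling missing data (items d, e, and g) are easiest to see when groups contain more than one row. The following is a minimal sketch under assumed data: the `Department` column and the choice to fill the gap with the mean salary are illustrative assumptions, not a recommendation for every dataset.

```python
import pandas as pd

# Hypothetical data: two departments, one missing salary
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Department": ["HR", "IT", "IT", "HR"],
    "Salary": [50000.0, 60000.0, None, 80000.0],
})

# Identify missing values: number of nulls in each column
missing_counts = df.isna().sum()

# Fill the missing salary with the mean of the known salaries
# (an assumed strategy for illustration; dropping the row is another option)
filled = df.fillna({"Salary": df["Salary"].mean()})

# Group by department and compute two aggregates per group
summary = filled.groupby("Department")["Salary"].agg(["mean", "max"])

print(missing_counts)
print(summary)
```

Filling before grouping changes the group statistics, so the order of these two steps is a design choice that depends on the analysis.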

Here’s an illustrative code example demonstrating some of these data manipulation tasks using Pandas:

```python
import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
        'Age': [25, 30, 35, 40, 45],
        'Salary': [50000, 60000, 70000, 80000, 90000]}
df = pd.DataFrame(data)

# Selecting data
print(df['Name'])  # Selecting a single column
print(df[['Name', 'Age']])  # Selecting multiple columns

# Filtering data
print(df[df['Age'] > 30])  # Keeping only the rows where Age is greater than 30

# Sorting data
print(df.sort_values(by='Age', ascending=False))  # Sorting rows by age in descending order

# Grouping and aggregating data
# Every age is unique in this sample, so each group contains a single row
print(df.groupby('Age')['Salary'].mean())  # Computing the average salary for each age

# Applying functions
print(df['Name'].apply(lambda x: x.upper()))  # Applying a function to each element in the 'Name' column

# Handling missing data
df['Salary'] = df['Salary'].astype(float)  # Cast to float so the column can hold a missing value (NaN)
df.loc[2, 'Salary'] = None  # Introducing a missing value
print(df.dropna())  # Dropping rows with missing values (returns a new DataFrame; df itself is unchanged)

# Merging and joining data
data2 = {'Name': ['Alice', 'Bob', 'Charlie'],
         'Department': ['HR', 'Finance', 'IT']}
df2 = pd.DataFrame(data2)
print(pd.merge(df, df2, on='Name'))  # Inner join based on the 'Name' column

# Reshaping data
print(df.melt(id_vars=['Name'], value_vars=['Age', 'Salary']))  # Melting the DataFrame from wide to long format

# Appending and concatenating data
df3 = pd.DataFrame({'Name': ['Frank'], 'Age': [50], 'Salary': [95000]})
print(pd.concat([df, df3], ignore_index=True))  # Appending a new row and renumbering the index
```

This code demonstrates various data manipulation operations such as selecting columns, filtering rows, sorting, grouping, applying functions, handling missing data, merging, reshaping, appending, and concatenating DataFrames using Pandas. These operations are essential for cleaning, transforming, and analyzing datasets in data science and data analysis tasks.
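Pivoting, joins that keep unmatched rows, and column-wise concatenation are named in this lesson but do not appear in the example above. The sketch below shows one way each might look; the yearly salary table, the `Bonus` column, and all variable names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical long-format data: one row per (name, year) pair
long_df = pd.DataFrame({
    "Name": ["Alice", "Alice", "Bob", "Bob"],
    "Year": [2023, 2024, 2023, 2024],
    "Salary": [50000, 52000, 60000, 63000],
})

# Pivoting: one row per name, one column per year (the reverse of melt)
wide_df = long_df.pivot(index="Name", columns="Year", values="Salary")

# Left join: keep every row of long_df, even where no department matches
departments = pd.DataFrame({"Name": ["Alice"], "Department": ["HR"]})
merged = pd.merge(long_df, departments, on="Name", how="left")

# Concatenating along columns: rows are aligned on the index labels
bonus = pd.DataFrame({"Bonus": [1000, 2000]}, index=["Alice", "Bob"])
combined = pd.concat([wide_df, bonus], axis=1)

print(wide_df)
print(merged)
print(combined)
```

With `how="left"`, Bob's rows remain in `merged` with a missing `Department`, whereas the default inner join would drop them.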
