Answers for "pd drop duplicate rows"

drop duplicate index pandas

df3 = df3[~df3.index.duplicated(keep='first')]

Posted by: Guest on November-12-2020
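
The one-liner in the answer above removes rows whose index label repeats; it does not compare column values. For duplicate rows in the sense of the page title, pandas also provides `DataFrame.drop_duplicates`. The sketch below contrasts the two on made-up data; the frame contents and the names `by_index` and `by_values` are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample: index label "a" appears twice,
# and the last two rows hold identical column values.
df3 = pd.DataFrame(
    {"name": ["Alice", "Bob", "Bob"], "age": [5, 7, 7]},
    index=["a", "a", "b"],
)

# Keep the first row for each index label (the approach from the answer above).
by_index = df3[~df3.index.duplicated(keep="first")]

# Keep the first row for each distinct combination of column values;
# the index is not considered here.
by_values = df3.drop_duplicates(keep="first")

print(by_index)
print(by_values)
```

On this data the two calls keep different rows: the index-based filter drops the second "a" row, while `drop_duplicates` drops the repeated Bob row.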

Return a new DataFrame with duplicate rows removed

# Return a new DataFrame with duplicate rows removed

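from pyspark.sql import Row, SparkSession

# `sc` is predefined in the PySpark shell; create it explicitly when running as a script
spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext
df = sc.parallelize([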
  Row(name='Alice', age=5, height=80),
  Row(name='Alice', age=5, height=80),
  Row(name='Alice', age=10, height=80)]).toDF()
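# Drop rows that are identical across all columns.
# The column order shown is from Spark 2.x, which sorted keyword-argument Row fields
# alphabetically; Spark 3.0+ keeps the declared order (name, age, height).
df.dropDuplicates().show()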
# +---+------+-----+
# |age|height| name|
# +---+------+-----+
# |  5|    80|Alice|
# | 10|    80|Alice|
# +---+------+-----+

# Drop rows that match on a subset of columns; which matching row is kept is not guaranteed
df.dropDuplicates(['name', 'height']).show()
# +---+------+-----+
# |age|height| name|
# +---+------+-----+
# |  5|    80|Alice|
# +---+------+-----+

Posted by: Guest on April-08-2020
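
Since the page title asks about pandas while the answer above uses PySpark, here is a rough pandas counterpart of the same two calls, reusing the sample rows from that answer. The variable names `people`, `all_columns`, and `name_height` are assumptions made for illustration.

```python
import pandas as pd

# Same three rows as the PySpark example above.
people = pd.DataFrame(
    {
        "name": ["Alice", "Alice", "Alice"],
        "age": [5, 5, 10],
        "height": [80, 80, 80],
    }
)

# Counterpart of df.dropDuplicates(): compare every column.
all_columns = people.drop_duplicates()

# Counterpart of df.dropDuplicates(['name', 'height']): compare only these columns.
# keep="first" retains the earliest matching row.
name_height = people.drop_duplicates(subset=["name", "height"], keep="first")

print(all_columns)
print(name_height)
```

Unlike Spark, pandas keeps rows in their original order, so `keep="first"` deterministically retains the row with age 5 in the subset case.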

