Updated On : Nov-11,2021 Time Investment : ~30 mins

# xarray: Simple Guide to Labeled N-Dimensional Array (DataArray)

Xarray is a Python library that lets us create N-dimensional arrays just like numpy, but it also lets us name each dimension of the array. Apart from naming dimensions, it lets us specify coordinate data for each dimension and record attributes alongside the array. All the operations that we perform on a numpy array using integer indexing can be performed on an xarray array as well, and the same operations can also be performed using dimension names. Code written using xarray becomes more intuitive because we refer to dimensions by name instead of by integer position. The concepts of dimensions, coordinates, and attributes will become clearer when we explain arrays with examples below.

Xarray provides two important data structures to store data.

• DataArray - It's a data structure that is used to represent an N-dimensional array.
• Dataset - It's a dict-like container holding multiple DataArray objects. The DataArray objects are aligned across shared dimensions.

As a part of this tutorial, we'll be discussing only the DataArray data structure. We'll explain with simple examples how to create DataArrays, index them, and perform normal array operations and simple statistics on them. If you have come to learn about the Dataset data structure, then please feel free to check our separate tutorial where we have covered it in detail with examples.

Below we have highlighted important sections of the tutorial to give an overview of the material covered.

### Important Sections of Tutorial

1. DataArray Creation
• Creation From Numpy Array
• DataArray with Attributes
• Creation From Pandas Series
• Creation From DataFrame
2. Indexing DataArray
• Numpy Like Integer Indexing
• Pandas Like Indexing using .loc Property
• Integer Indexing using isel() Function
• Indexing Based on Dimension Data using sel() Function
3. Normal Array Operations
4. Simple Statistics

We have imported all necessary libraries at the beginning of our tutorial.

```import xarray as xr

print("Xarray Version : {}".format(xr.__version__))
```
```Xarray Version : 0.20.1
```
```import numpy as np

print("Numpy Version : {}".format(np.__version__))
```
```Numpy Version : 1.20.3
```
```import pandas as pd

print("Pandas Version : {}".format(pd.__version__))
```
```Pandas Version : 1.3.4
```
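Before moving on to creation, here is a small sketch of what "using dimension names instead of integer indexing" can look like. The temperature values and the dimension names `day` and `city` are made up for illustration; they are not part of the tutorial's data.

```python
import numpy as np
import xarray as xr

# Hypothetical data: 3 days x 2 cities of temperatures.
temps = np.array([[20.0, 25.0],
                  [22.0, 27.0],
                  [24.0, 29.0]])

# Plain numpy: we must remember that axis 0 is "day".
numpy_mean = temps.mean(axis=0)

# xarray: the same reduction, addressed by dimension name.
labeled = xr.DataArray(temps, dims=["day", "city"])
xarray_mean = labeled.mean(dim="day")

print(numpy_mean)
print(xarray_mean.values)
print(xarray_mean.dims)
```

Both reductions give the same numbers; the xarray version simply states which dimension is being averaged away.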

## 1. DataArray Creation

In this section, we'll explain various ways of creating an xarray DataArray object. We'll explore different methods available from xarray to create arrays.

#### Creation From Numpy Array

The first and simplest way to create a DataArray is by using the DataArray() constructor available from xarray. We can provide a numpy array, python list, pandas series, or pandas dataframe to this constructor to create a DataArray object. Below we have highlighted the signature of the DataArray() constructor for reference purposes.

• DataArray(data, coords=None, dims=None, name=None, attrs=None) - This constructor takes as input a numpy array, python list, pandas series, or pandas dataframe and creates an instance of DataArray. All other parameters are optional, and our examples pass them as keyword arguments.
• The dims parameter accepts a list of strings giving a name to each dimension of the array. For a 1D array we need to provide a list with one name, for a 2D array a list with 2 names, for a 3D array a list with 3 names, and so on.
• The coords parameter accepts a dictionary specifying the coordinate values (labels) of each dimension, which can be used when indexing the array. Each key is the name of a dimension and each value is a list whose length equals the size of that dimension. E.g. - For a 2D array of shape 3x5, we can provide a dictionary with 2 entries where one holds a list of 3 values and the other holds a list of 5 values.
• The attrs parameter accepts a dictionary of attributes (metadata) that we want to attach to the array to describe it.
• The name parameter accepts a string specifying the name of the array.
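The parameters described above can be sketched together as follows. The array contents and the names `row`, `col`, and `Demo` are illustrative assumptions; the second call shows that a coordinate list whose length does not match its dimension is rejected.

```python
import numpy as np
import xarray as xr

# Hypothetical 3x5 array with all optional parameters supplied.
demo = xr.DataArray(
    data=np.zeros((3, 5)),
    dims=["row", "col"],
    coords={"row": [10, 20, 30], "col": list("ABCDE")},
    attrs={"units": "unknown"},
    name="Demo",
)
print(dict(demo.sizes))

# A coordinate list whose length does not match its dimension is rejected.
try:
    xr.DataArray(
        data=np.zeros((3, 5)),
        dims=["row", "col"],
        coords={"row": [10, 20]},  # only 2 labels for a dimension of size 3
    )
    mismatch_error = None
except ValueError as err:
    mismatch_error = err

print(type(mismatch_error).__name__)
```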

Below we have created our first xarray DataArray using a random numpy array of shape (5,). As it is a 1D array, we have given the dims parameter a single name. We have named the single dimension of our array 'index'.

```arr = xr.DataArray(data=np.random.rand(5), dims=["index"])

arr
```
```<xarray.DataArray (index: 5)>
array([0.97636211, 0.76268531, 0.53293316, 0.38971404, 0.84243048])
Dimensions without coordinates: index```

Below we have created another example where we have created a 2D DataArray of shape 4x5 using a numpy array of random numbers. We have specified two dimension names this time as we have a 2D array.

```arr = xr.DataArray(data=np.random.rand(4,5), dims=["index", "columns"])

arr
```
```<xarray.DataArray (index: 4, columns: 5)>
array([[0.48648517, 0.12542794, 0.4972441 , 0.69972002, 0.32564098],
[0.94822908, 0.87763739, 0.20857022, 0.9199263 , 0.88037042],
[0.62336462, 0.11829816, 0.27168636, 0.77116992, 0.77662334],
[0.76880574, 0.53286298, 0.06375732, 0.38386554, 0.04482307]])
Dimensions without coordinates: index, columns```

We can access data of our array anytime using data attribute of DataArray object.

```arr.data
```
```array([[0.48648517, 0.12542794, 0.4972441 , 0.69972002, 0.32564098],
[0.94822908, 0.87763739, 0.20857022, 0.9199263 , 0.88037042],
[0.62336462, 0.11829816, 0.27168636, 0.77116992, 0.77662334],
[0.76880574, 0.53286298, 0.06375732, 0.38386554, 0.04482307]])```

Other array attributes like dtype, shape, size, ndim, and nbytes, which are available for numpy arrays, are also available for DataArray. The nbytes attribute returns the total number of bytes taken by the array's data, which is 160 (20*8) in this case (20 float elements, each of size 8 bytes). DataArray additionally provides the sizes attribute, which maps each dimension name to its length.

```arr.dtype
```
`dtype('float64')`
```arr.nbytes
```
`160`
```arr.ndim
```
`2`
```arr.shape
```
`(4, 5)`
```arr.size
```
`20`
```arr.sizes
```
`Frozen({'index': 4, 'columns': 5})`
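The arithmetic behind nbytes can be checked directly. This is a minimal sketch on a fresh random array of the same shape; the variable names are our own.

```python
import numpy as np
import xarray as xr

arr_check = xr.DataArray(np.random.rand(4, 5), dims=["index", "columns"])

# Bytes used by the data = number of elements * bytes per element.
expected_nbytes = arr_check.size * arr_check.dtype.itemsize
print(arr_check.nbytes, expected_nbytes)

# sizes maps each dimension name to its length; shape follows the order of dims.
print(dict(arr_check.sizes))
print(arr_check.dims, arr_check.shape)
```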

Below we have created another example explaining how we can create DataArray of 3D shape.

```arr = xr.DataArray(data=np.random.rand(2,3,4), dims=["index", "columns", "items"])

arr
```
```<xarray.DataArray (index: 2, columns: 3, items: 4)>
array([[[0.47468654, 0.30231721, 0.65516318, 0.92652759],
[0.4320954 , 0.97064867, 0.63535385, 0.6786689 ],
[0.85087508, 0.40156857, 0.83255594, 0.67374223]],

[[0.78386596, 0.63289745, 0.78499957, 0.62841028],
[0.21529929, 0.03341366, 0.12401273, 0.79578469],
[0.68887276, 0.63861678, 0.19319422, 0.83450311]]])
Dimensions without coordinates: index, columns, items```

In all our previous examples, we only specified dimension names of DataArray but we did not specify coordinates for those dimensions. Now, we'll explain how we can include coordinates for the dimensions of an array.

Below we have created a 2D DataArray using a random numpy array. We have specified coordinates of our array by providing a dictionary to coords parameter. We have defined two dimensions of data (index, columns). The index represents the first dimension of size 4 and columns represents the second dimension of size 5. We have provided a simple python list of size 4 for index dimension and a list of strings for columns dimension. Apart from specifying coordinates, we have also specified the name of an array using name parameter.

When we define an array using dimension values like this, we can access subarray and elements of an array using these values for indexing. We'll be explaining how we can use these values to perform indexing in the upcoming section of the tutorial.

```arr1 = xr.DataArray(data=np.random.rand(4,5), dims=['index','columns'],
coords={"index": [0,1,2,3], "columns": list("ABCDE")},
name="Array1"
)

arr1
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ],
[0.46849765, 0.07263884, 0.20157703, 0.99471873, 0.93488703],
[0.93084546, 0.24244413, 0.82591196, 0.81989938, 0.68520558],
[0.24271528, 0.62774479, 0.66185214, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have created another DataArray using a random numpy array. This time we have specified index dimension values as a list of strings, unlike our previous examples where values were a list of integers.

We'll be using these arrays during the indexing section to explain indexing in different ways using these coordinate values.

```arr2 = xr.DataArray(data=np.random.rand(4,5),
dims=['index','columns'],
coords={"index": ['0','1','2','3'], "columns": list("ABCDE")},
name="Array2"
)

arr2
```
```<xarray.DataArray 'Array2' (index: 4, columns: 5)>
array([[0.07511355, 0.60393655, 0.74898288, 0.25581543, 0.79114767],
[0.73421379, 0.7067142 , 0.24650569, 0.2074986 , 0.41164924],
[0.50616351, 0.64518492, 0.66608194, 0.16975831, 0.67817385],
[0.36282616, 0.63435477, 0.56852942, 0.81083044, 0.46026918]])
Coordinates:
* index    (index) <U1 '0' '1' '2' '3'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have created another DataArray of shape 4x5 whose data is a random numpy array. This time we have specified the index dimension values as a list of dates. We have used the pandas date_range() function to create four consecutive daily dates starting from 2021-01-01.

```arr3 = xr.DataArray(data=np.random.rand(4,5),
dims=['index','columns'],
coords={"index": pd.date_range(start="2021-01-01", freq="D", periods=4),
"columns": list("ABCDE")},
name="Array3"
)

arr3
```
```<xarray.DataArray 'Array3' (index: 4, columns: 5)>
array([[0.39792208, 0.79787484, 0.94760726, 0.01103115, 0.34796905],
[0.21345645, 0.89753226, 0.00395103, 0.66829528, 0.11539251],
[0.94518946, 0.21601817, 0.05817   , 0.49979745, 0.89442209],
[0.00257528, 0.57121823, 0.67385832, 0.87298376, 0.36179141]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 2021-01-02 2021-01-03 2021-01-04
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
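Once an array carries a name, dimensions, and coordinates, each piece can be read back from the object. The sketch below rebuilds an array like arr3 under a different variable name (`dated`, our own choice) and inspects it.

```python
import numpy as np
import pandas as pd
import xarray as xr

dated = xr.DataArray(
    data=np.random.rand(4, 5),
    dims=["index", "columns"],
    coords={
        "index": pd.date_range(start="2021-01-01", freq="D", periods=4),
        "columns": list("ABCDE"),
    },
    name="Array3",
)

print(dated.name)                      # name of the array
print(dated.dims)                      # dimension names, in order
print(list(dated.coords))              # coordinate names
print(dated.coords["columns"].values)  # coordinate labels as a numpy array
print(dated["index"].values[0])        # first date label
```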

#### DataArray with Attributes

In this section, we have explained how we can create an array with attributes.

We have created a DataArray of shape 4x5 using a random numpy array. We have specified dimensions and coordinates as before. Apart from that, we have provided a dictionary to the attrs parameter describing our dataset. We can describe our data, dimensions, and coordinates in this dictionary.

```arr = xr.DataArray(
data=np.random.rand(4,5),
dims=['index','columns'],
coords={"index": ['0','1','2','3'], "columns": list("ABCDE")},
attrs={"index": "X-Dimension of Data",
"columns": "Y-Dimension of Data",
"info": "Pandas DataFrame",
"long_name": "Random Data",
"units": "Unknown"
},
name="Array"
)

arr
```
```<xarray.DataArray 'Array' (index: 4, columns: 5)>
array([[0.38733228, 0.23109638, 0.66964265, 0.6708009 , 0.95829975],
[0.7713564 , 0.1166787 , 0.6483082 , 0.75409353, 0.76900532],
[0.54839661, 0.72950701, 0.65034097, 0.92334631, 0.70863973],
[0.0655017 , 0.56941354, 0.59030199, 0.5371372 , 0.45977435]])
Coordinates:
* index    (index) <U1 '0' '1' '2' '3'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'
Attributes:
index:      X-Dimension of Data
columns:    Y-Dimension of Data
info:       Pandas DataFrame
long_name:  Random Data
units:      Unknown```

We can access attributes of our DataArray using attrs attribute anytime.

```arr.attrs
```
```{'index': 'X-Dimension of Data',
'columns': 'Y-Dimension of Data',
'info': 'Pandas DataFrame',
'long_name': 'Random Data',
'units': 'Unknown'}```
```arr.attrs["index"]
```
`'X-Dimension of Data'`
```arr.attrs["long_name"]
```
`'Random Data'`
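Because attrs is an ordinary dictionary, it can also be modified after the array is created. A minimal sketch, with made-up attribute values:

```python
import numpy as np
import xarray as xr

noted = xr.DataArray(
    np.random.rand(2, 3),
    dims=["index", "columns"],
    attrs={"long_name": "Random Data", "units": "Unknown"},
)

# attrs behaves like a dictionary, so entries can be updated or added.
noted.attrs["units"] = "meters"            # overwrite an existing attribute
noted.attrs["source"] = "np.random.rand"   # add a new attribute

print(noted.attrs)
```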

#### Creation From Pandas Series

In this section, we have explained how we can create DataArray from the pandas series.

Below we have first created a pandas series with index and data.

```ser = pd.Series([1,2,3,4], index=list("ABCD"),name="col")

ser
```
```A    1
B    2
C    3
D    4
Name: col, dtype: int64```

We can create a DataArray by just giving the pandas series as input. It'll take coordinate data from the index values of the series. The dimension gets the default name dim_0, as the output below shows.

```arr_ser = xr.DataArray(ser)

arr_ser
```
```<xarray.DataArray 'col' (dim_0: 4)>
array([1, 2, 3, 4])
Coordinates:
* dim_0    (dim_0) object 'A' 'B' 'C' 'D'```
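The default name dim_0 is not very descriptive. One possible way to replace it, sketched here with the hypothetical name `letter`, is the rename() method, which renames both the dimension and its coordinate.

```python
import pandas as pd
import xarray as xr

ser_demo = pd.Series([1, 2, 3, 4], index=list("ABCD"), name="col")
arr_default = xr.DataArray(ser_demo)

# Give the auto-generated dimension a descriptive (hypothetical) name.
arr_named = arr_default.rename({"dim_0": "letter"})

print(arr_default.dims)
print(arr_named.dims)
print(int(arr_named.sel(letter="B")))
```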

#### Creation From DataFrame

In this section, we have explained how we can create DataArray from pandas dataframe.

Below we have created pandas dataframe with random data. We have also provided dataframe index values and column names.

```df = pd.DataFrame(np.random.rand(4,5), index=[0,1,2,3], columns=list("ABCDE"))

df
```
|   | A | B | C | D | E |
|---|---|---|---|---|---|
| 0 | 0.236578 | 0.285889 | 0.370095 | 0.357964 | 0.162042 |
| 1 | 0.324387 | 0.495267 | 0.203329 | 0.352109 | 0.566172 |
| 2 | 0.163010 | 0.381800 | 0.082297 | 0.831716 | 0.842050 |
| 3 | 0.559487 | 0.871914 | 0.340260 | 0.459081 | 0.346937 |

We can create a DataArray from a pandas dataframe directly. It'll take coordinate values from the index and column names of the pandas dataframe. The two dimensions get the default names dim_0 and dim_1, as the output below shows.

```arr_df = xr.DataArray(df)

arr_df
```
```<xarray.DataArray (dim_0: 4, dim_1: 5)>
array([[0.23657771, 0.28588863, 0.37009544, 0.35796388, 0.16204199],
[0.32438665, 0.49526733, 0.20332903, 0.35210868, 0.56617198],
[0.16300996, 0.38179992, 0.08229747, 0.83171561, 0.8420505 ],
[0.55948712, 0.87191389, 0.34025972, 0.45908091, 0.34693702]])
Coordinates:
* dim_0    (dim_0) int64 0 1 2 3
* dim_1    (dim_1) object 'A' 'B' 'C' 'D' 'E'```
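If we prefer to choose the dimension names ourselves, one option is to pass the dataframe's raw values together with explicit dims and coords. This is a sketch under that assumption; the dataframe here uses fixed values instead of random ones so the lookup is easy to verify.

```python
import numpy as np
import pandas as pd
import xarray as xr

df_demo = pd.DataFrame(np.arange(20.0).reshape(4, 5),
                       index=[0, 1, 2, 3], columns=list("ABCDE"))

# Pass the raw values plus explicit names, reusing the DataFrame's labels.
arr_explicit = xr.DataArray(
    data=df_demo.to_numpy(),
    dims=["index", "columns"],
    coords={"index": df_demo.index, "columns": df_demo.columns},
)

print(arr_explicit.dims)
print(float(arr_explicit.sel(index=1, columns="B")), df_demo.loc[1, "B"])
```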

## 2. Indexing DataArray

In this section, we'll explain how we can perform indexing operations on xarray DataArray. We can do normal numpy indexing using integers as well as indexing using coordinate values that we specified when creating arrays. We'll be performing indexing on arrays that we created during the array creation section earlier.

#### Numpy Like Integer Indexing

In this section, we have performed normal numpy-like integer indexing on our xarray DataArray.

Below we have accessed the 0th element along the first dimension (the first row) of the 2D array arr1 which we created earlier.

```arr1[0]
```
```<xarray.DataArray 'Array1' (columns: 5)>
array([0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ])
Coordinates:
index    int64 0
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have accessed all elements of the first dimension and the 0th element of the second dimension. This is like accessing the first column of a 2D array.

```arr1[:, 0]
```
```<xarray.DataArray 'Array1' (index: 4)>
array([0.57868507, 0.46849765, 0.93084546, 0.24271528])
Coordinates:
* index    (index) int64 0 1 2 3
columns  <U1 'A'```

Below we have accessed the 0th and 1st row of our data.

```arr1[[0,1]]
```
```<xarray.DataArray 'Array1' (index: 2, columns: 5)>
array([[0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ],
[0.46849765, 0.07263884, 0.20157703, 0.99471873, 0.93488703]])
Coordinates:
* index    (index) int64 0 1
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have accessed the 0th and 1st column of our 2D array.

```arr1[:,[0,1]]
```
```<xarray.DataArray 'Array1' (index: 4, columns: 2)>
array([[0.57868507, 0.78605464],
[0.46849765, 0.07263884],
[0.93084546, 0.24244413],
[0.24271528, 0.62774479]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B'```

Below we have accessed 2D array of shape 2x2 from our original 4x5 array.

```arr1[[1,2],[0,1]]
```
```<xarray.DataArray 'Array1' (index: 2, columns: 2)>
array([[0.46849765, 0.07263884],
[0.93084546, 0.24244413]])
Coordinates:
* index    (index) int64 1 2
* columns  (columns) <U1 'A' 'B'```
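The 2x2 result above is worth a closer look, because plain numpy treats the same pair of index lists differently: it pairs them element-wise, while xarray applies each list to its own dimension. The sketch below compares the two on a small array of known values (our own example data).

```python
import numpy as np
import xarray as xr

values = np.arange(20).reshape(4, 5)
labeled = xr.DataArray(values, dims=["index", "columns"])

# numpy pairs the two lists element-wise: it picks (1, 0) and (2, 1).
numpy_result = values[[1, 2], [0, 1]]

# xarray selects rows 1 and 2 and columns 0 and 1 independently (a 2x2 block).
xarray_result = labeled[[1, 2], [0, 1]]

print(numpy_result.shape, numpy_result)
print(xarray_result.shape)
print(xarray_result.values)
```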

#### Pandas Like Indexing using .loc Property

The xarray DataArray provides a loc property which we can use to index arrays as we do with a pandas dataframe. The loc property lets us specify the coordinate values that we provided when we created the array. The coordinate values can be of any type (string, date, time, etc.), not only integer.

Below we have accessed the first element of the first dimension of our DataArray which we created earlier.

```arr1.loc[0]
```
```<xarray.DataArray 'Array1' (columns: 5)>
array([0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ])
Coordinates:
index    int64 0
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have accessed a sub-array by using the loc property. The sub-array covers the element labeled 0 of the first dimension and the first two values ('A' and 'B') of the second dimension. We have used string values for indexing the second dimension this time.

```arr1.loc[0, ["A","B"]]
```
```<xarray.DataArray 'Array1' (columns: 2)>
array([0.57868507, 0.78605464])
Coordinates:
index    int64 0
* columns  (columns) <U1 'A' 'B'```

Below we have accessed the first value of the first dimension of the DataArray arr2, which we created earlier, using the loc property. We have used a string value because the 'index' coordinates of this array are strings.

```arr2.loc['0']
```
```<xarray.DataArray 'Array2' (columns: 5)>
array([0.07511355, 0.60393655, 0.74898288, 0.25581543, 0.79114767])
Coordinates:
index    <U1 '0'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have accessed another sub-array from our original DataArray using all indices as string values inside of loc property.

```arr2.loc['0', ["A","B","C"]]
```
```<xarray.DataArray 'Array2' (columns: 3)>
array([0.07511355, 0.60393655, 0.74898288])
Coordinates:
index    <U1 '0'
* columns  (columns) <U1 'A' 'B' 'C'```

Below we have accessed the sub-array from our array where we had first dimension coordinates specified as date values. We have specified the date value as a string.

```arr3.loc["2021-1-1", ['A','B']]
```
```<xarray.DataArray 'Array3' (columns: 2)>
array([0.39792208, 0.79787484])
Coordinates:
index    datetime64[ns] 2021-01-01
* columns  (columns) <U1 'A' 'B'```

Below we have created another example where we are accessing a sub-array from our array with a date dimension. We have specified a list of dates as strings this time to access the sub-array.

```arr3.loc[["2021-1-1","2021-1-3"], ['A','B']]
```
```<xarray.DataArray 'Array3' (index: 2, columns: 2)>
array([[0.39792208, 0.79787484],
[0.94518946, 0.21601817]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 2021-01-03
* columns  (columns) <U1 'A' 'B'```

In this example, we have accessed sub-array from our date dimension array by providing date dimension coordinates as a list of dates. We have created a list of 3 dates using date_range() function and provided it to filter first dimension values.

```three_days = pd.date_range(start="2021-1-1",periods=3)

arr3.loc[three_days, ["A","B","C"]]
```
```<xarray.DataArray 'Array3' (index: 3, columns: 3)>
array([[0.39792208, 0.79787484, 0.94760726],
[0.21345645, 0.89753226, 0.00395103],
[0.94518946, 0.21601817, 0.05817   ]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 2021-01-02 2021-01-03
* columns  (columns) <U1 'A' 'B' 'C'```
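Besides single labels and lists of labels, loc also accepts label slices and a dictionary keyed by dimension name. The sketch below uses an array with known values (our own example data) to show that label slices include both endpoints, as they do in pandas.

```python
import numpy as np
import pandas as pd
import xarray as xr

loc_demo = xr.DataArray(
    np.arange(20).reshape(4, 5),
    dims=["index", "columns"],
    coords={
        "index": pd.date_range(start="2021-01-01", freq="D", periods=4),
        "columns": list("ABCDE"),
    },
)

# Label slices include BOTH endpoints, as in pandas .loc.
by_label = loc_demo.loc["2021-01-02":"2021-01-03", "B":"D"]
print(by_label.shape)
print(by_label.values)

# A dictionary keyed by dimension name avoids relying on dimension order.
by_name = loc_demo.loc[{"columns": "A"}]
print(by_name.values)
```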

#### Integer Indexing using isel() Function

The xarray DataArray has a method named isel() which lets us specify dimension values as integers and access the sub-array of the original array based on values provided to it.

To perform indexing using the isel() method, we can provide dimension names and their integer positions either as a dictionary or as keyword arguments of the method. We'll explain with examples below how we can use this method.

Below we have retrieved the 0th element of the 'index' dimension of the array using isel() method. We have provided value to the dimension as if it is a parameter of the method.

```arr1.isel(index=0)
```
```<xarray.DataArray 'Array1' (columns: 5)>
array([0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ])
Coordinates:
index    int64 0
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have recreated our previous example by providing the integer position for the dimension as a dictionary. This has the same effect as the previous cell.

```arr1.isel({'index':0})
```
```<xarray.DataArray 'Array1' (columns: 5)>
array([0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ])
Coordinates:
index    int64 0
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have retrieved a 2D array of shape 2x4 using the isel() method. We have provided two integer positions for the 'index' dimension and four integer positions for the 'columns' dimension.

```arr1.isel(index=[0,1], columns=[0,1,2,3])
```
```<xarray.DataArray 'Array1' (index: 2, columns: 4)>
array([[0.57868507, 0.78605464, 0.90389917, 0.85013705],
[0.46849765, 0.07263884, 0.20157703, 0.99471873]])
Coordinates:
* index    (index) int64 0 1
* columns  (columns) <U1 'A' 'B' 'C' 'D'```

Below we have recreated our previous example by providing the integer positions as a dictionary.

```arr1.isel({'index':[0,1], 'columns':[0,1,2,3]})
```
```<xarray.DataArray 'Array1' (index: 2, columns: 4)>
array([[0.57868507, 0.78605464, 0.90389917, 0.85013705],
[0.46849765, 0.07263884, 0.20157703, 0.99471873]])
Coordinates:
* index    (index) int64 0 1
* columns  (columns) <U1 'A' 'B' 'C' 'D'```
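The isel() method also accepts slices and negative positions, which follow the usual numpy conventions. A minimal sketch on an array with known values (our own example data):

```python
import numpy as np
import xarray as xr

isel_demo = xr.DataArray(
    np.arange(20).reshape(4, 5),
    dims=["index", "columns"],
    coords={"index": [0, 1, 2, 3], "columns": list("ABCDE")},
)

# Integer slices exclude the stop position, as in numpy.
first_two_rows = isel_demo.isel(index=slice(0, 2))
print(first_two_rows.shape)

# Negative positions count from the end of the dimension.
last_column = isel_demo.isel(columns=-1)
print(last_column.values)
print(last_column["columns"].values.item())
```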

#### Indexing Based on Dimension Data using sel() Function

The xarray DataArray provides a method named sel() which works like isel() but accepts actual coordinate values, rather than integer positions, to access sub-arrays. We can provide values either as a dictionary or as keyword arguments of the method.

Below we have retrieved a sub-array of shape 3x5 from our original array using sel() method. The 'index' dimension has coordinate values as integers hence we have provided them as integers.

```arr1.sel(index=[0,1,2])
```
```<xarray.DataArray 'Array1' (index: 3, columns: 5)>
array([[0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ],
[0.46849765, 0.07263884, 0.20157703, 0.99471873, 0.93488703],
[0.93084546, 0.24244413, 0.82591196, 0.81989938, 0.68520558]])
Coordinates:
* index    (index) int64 0 1 2
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have accessed a sub-array of shape 3x5 from the array arr2 using the sel() method. This time we have provided coordinate values as a list of strings because this array has its 'index' dimension values stored as strings.

```arr2.sel(index=['0','1','2'])
```
```<xarray.DataArray 'Array2' (index: 3, columns: 5)>
array([[0.07511355, 0.60393655, 0.74898288, 0.25581543, 0.79114767],
[0.73421379, 0.7067142 , 0.24650569, 0.2074986 , 0.41164924],
[0.50616351, 0.64518492, 0.66608194, 0.16975831, 0.67817385]])
Coordinates:
* index    (index) <U1 '0' '1' '2'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have accessed another 3x3 array from our original array using sel() method. We have provided coordinate values for both dimensions as a list of strings.

```arr2.sel(index=['0','1','2'], columns=['A','C','E'])
```
```<xarray.DataArray 'Array2' (index: 3, columns: 3)>
array([[0.07511355, 0.74898288, 0.79114767],
[0.73421379, 0.24650569, 0.41164924],
[0.50616351, 0.66608194, 0.67817385]])
Coordinates:
* index    (index) <U1 '0' '1' '2'
* columns  (columns) <U1 'A' 'C' 'E'```

Below we have created another example demonstrating the use of sel() method. This time we are selecting a sub-array along a dimension whose coordinate values are dates.

```arr3.sel(index=["2021-1-1","2021-1-2", "2021-1-3"], columns=['A','B'])
```
```<xarray.DataArray 'Array3' (index: 3, columns: 2)>
array([[0.39792208, 0.79787484],
[0.21345645, 0.89753226],
[0.94518946, 0.21601817]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 2021-01-02 2021-01-03
* columns  (columns) <U1 'A' 'B'```

Below we have created one more example demonstrating the use of sel() method. We have created a list of dates using the pandas date_range() function and used it to select a sub-array. We have provided this list of dates for the 'index' dimension of the array. For the 'columns' dimension, we have provided a list of 3 strings.

```three_days = pd.date_range(start="2021-1-1",periods=3)

arr3.sel(index=three_days, columns=['A','B', 'C'])
```
```<xarray.DataArray 'Array3' (index: 3, columns: 3)>
array([[0.39792208, 0.79787484, 0.94760726],
[0.21345645, 0.89753226, 0.00395103],
[0.94518946, 0.21601817, 0.05817   ]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 2021-01-02 2021-01-03
* columns  (columns) <U1 'A' 'B' 'C'```
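
As mentioned at the start of this section, sel() takes its selection either as keyword arguments or as a dictionary. The sketch below compares both forms on a hypothetical date-indexed array named `demo` (an illustration, not the tutorial's arr3) and also shows a label-based slice, which includes both end points.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical 5x3 array indexed by dates along 'index'.
dates = pd.date_range(start="2021-1-1", periods=5)
demo = xr.DataArray(
    np.arange(15).reshape(5, 3),
    dims=("index", "columns"),
    coords={"index": dates, "columns": ["A", "B", "C"]},
)

# Keyword-argument form and dictionary form select the same sub-array.
by_kwargs = demo.sel(index=dates[:2], columns=["A", "C"])
by_dict = demo.sel({"index": dates[:2], "columns": ["A", "C"]})

# A label-based slice includes both end points.
first_three = demo.sel(index=slice("2021-01-01", "2021-01-03"))
```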

## 3. Normal Array Operations ¶

In this section, we'll explain some of the commonly performed operations with arrays like addition, multiplication with scalar, transpose, dot product, null elements check, etc. We'll try to explain as many simple operations as possible with simple examples.

#### Transpose¶

We can retrieve the transpose of an array through its T attribute or by calling transpose() method on it.

```arr1_transpose = arr1.T # arr1.transpose() works same

arr1_transpose
```
```<xarray.DataArray 'Array1' (columns: 5, index: 4)>
array([[0.57868507, 0.46849765, 0.93084546, 0.24271528],
[0.78605464, 0.07263884, 0.24244413, 0.62774479],
[0.90389917, 0.20157703, 0.82591196, 0.66185214],
[0.85013705, 0.99471873, 0.81989938, 0.41166893],
[0.5950187 , 0.93488703, 0.68520558, 0.50476117]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
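
The transpose() method also accepts dimension names, which makes the intended order explicit. Below is a minimal sketch on a hypothetical array named `demo`; for a 2-dimensional array it gives the same result as the T attribute.

```python
import numpy as np
import xarray as xr

demo = xr.DataArray(
    np.arange(6).reshape(2, 3),
    dims=("index", "columns"),
    coords={"index": [0, 1], "columns": ["A", "B", "C"]},
)

# Reorder dimensions by name; equivalent to demo.T for a 2-D array.
swapped = demo.transpose("columns", "index")
```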

We can multiply, add, subtract, and perform many other element-wise operations between an array and a scalar.

```arr1 * 10
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[5.78685073, 7.8605464 , 9.03899168, 8.50137048, 5.95018697],
[4.68497649, 0.72638835, 2.01577033, 9.94718734, 9.34887027],
[9.30845463, 2.42444133, 8.25911956, 8.19899378, 6.85205578],
[2.42715276, 6.27744792, 6.61852136, 4.11668931, 5.04761172]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

When we add two arrays, xarray aligns them by dimension names and coordinate values, so only entries whose coordinate values are present in both arrays are combined.

That's the reason we first add arr1 to itself below to demonstrate array addition: its integer 'index' coordinate values do not match the string or date 'index' values of our other arrays.

```arr1 + arr1
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[1.15737015, 1.57210928, 1.80779834, 1.7002741 , 1.19003739],
[0.9369953 , 0.14527767, 0.40315407, 1.98943747, 1.86977405],
[1.86169093, 0.48488827, 1.65182391, 1.63979876, 1.37041116],
[0.48543055, 1.25548958, 1.32370427, 0.82333786, 1.00952234]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have added arr and arr2, which have the same dimension names and coordinate values.

```arr + arr2
```
```<xarray.DataArray (index: 4, columns: 5)>
array([[0.46244583, 0.83503292, 1.41862553, 0.92661634, 1.74944742],
[1.50557018, 0.8233929 , 0.89481389, 0.96159213, 1.18065456],
[1.05456012, 1.37469192, 1.31642291, 1.09310462, 1.38681358],
[0.42832786, 1.20376832, 1.1588314 , 1.34796764, 0.92004354]])
Coordinates:
* index    (index) <U1 '0' '1' '2' '3'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
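
To make the role of coordinate values in arithmetic concrete, here is a small sketch with two hypothetical 1-dimensional arrays (`left` and `right`) whose 'index' labels only partly overlap. With xarray's default settings, the sum keeps only the labels present in both arrays.

```python
import xarray as xr

# Two hypothetical 1-D arrays with partially overlapping 'index' labels.
left = xr.DataArray([1.0, 2.0, 3.0], dims="index", coords={"index": [0, 1, 2]})
right = xr.DataArray([10.0, 20.0, 30.0], dims="index", coords={"index": [1, 2, 3]})

# Arithmetic aligns on labels first; only labels 1 and 2 exist in both.
total = left + right
```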

#### argmax()¶

We can retrieve the index of the maximum element in the array using argmax() method. When called without a dimension, it returns the position of the maximum in the flattened array.

Below we have retrieved the index of the maximum element of one of our arrays.

```max_index = arr1.argmax()

max_index
```
```<xarray.DataArray 'Array1' ()>
array(8)```

We can call item() method on an array with one element to access that element as a Python scalar.

We can also pass a flat index to item() to retrieve the element at that position. Below we are retrieving the maximum element using item() method.

```arr1.item(max_index.item())
```
`0.9947187341846935`

The item() method can also accept a tuple of indices for arrays with more than one dimension to extract the individual element.

```arr1.item((0,0))
```
`0.5786850732755588`
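
The flat position returned by argmax() can be converted back into per-dimension positions with numpy's unravel_index() function. The sketch below uses a hypothetical array named `demo` and shows that item() returns the same element for the flat position and for the tuple of positions.

```python
import numpy as np
import xarray as xr

demo = xr.DataArray(
    np.array([[0.2, 0.9, 0.1], [0.4, 0.3, 0.8]]),
    dims=("index", "columns"),
)

# Position of the maximum in the flattened (row-major) array.
flat_pos = demo.argmax().item()

# Convert the flat position back into (row, column) positions.
row, col = np.unravel_index(flat_pos, demo.shape)

# item() accepts either the flat position or the tuple of positions.
max_by_flat = demo.item(flat_pos)
max_by_tuple = demo.item((int(row), int(col)))
```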

As we had said earlier, the majority of array operations which we perform on a numpy array can be performed on xarray DataArray as well. The major difference is that DataArray lets us perform those operations using either a dimension name or an axis index, whereas a numpy array only accepts an axis index.

Below we have retrieved the positions of maximum values along the 'index' dimension of an array.

```max_indices = arr1.argmax(dim='index', skipna=True)

max_indices
```
```<xarray.DataArray 'Array1' (columns: 5)>
array([2, 0, 0, 1, 1])
Coordinates:
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

#### idxmax()¶

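The idxmax() method is similar to argmax() method, with the difference that it returns the coordinate value (label) of the maximum rather than its integer position. In the output below the labels are printed as floats even though the 'index' coordinate holds integers, most likely because the result is promoted to a type that can hold NaN as a fill value.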

```max_indices = arr1.idxmax(dim='index',skipna=True)

max_indices
```
```<xarray.DataArray 'index' (columns: 5)>
array([2., 0., 0., 1., 1.])
Coordinates:
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
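
The difference between positions and labels is easier to see when the coordinate values are not 0, 1, 2, and so on. Below is a minimal sketch on a hypothetical array named `demo` whose 'index' labels are 10, 20 and 30.

```python
import numpy as np
import xarray as xr

# Hypothetical array whose 'index' labels differ from their positions.
demo = xr.DataArray(
    np.array([[1.0, 9.0], [7.0, 2.0], [3.0, 4.0]]),
    dims=("index", "columns"),
    coords={"index": [10, 20, 30], "columns": ["A", "B"]},
)

positions = demo.argmax(dim="index")  # integer positions along 'index'
labels = demo.idxmax(dim="index")     # coordinate labels along 'index'
```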

#### argmin()¶

The argmin() method can be used to retrieve the positions of minimum values.

Below we have retrieved the positions of minimum values along the 'columns' dimension.

There is an idxmin() method as well, which returns the coordinate values of the minimums in the same way idxmax() does for the maximums.

```min_indices = arr1.argmin(dim='columns')

min_indices
```
```<xarray.DataArray 'Array1' (index: 4)>
array([0, 1, 1, 0])
Coordinates:
* index    (index) int64 0 1 2 3```

#### isnull()¶

The isnull() method detects NaN/None values in an array. It returns an array of the same shape as the original array with boolean values indicating the presence/absence of NaN/None values.

```arr1.isnull()
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
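
Since the result of isnull() is a boolean array, summing it counts the missing values. The sketch below does this on a hypothetical array named `demo` that contains two NaNs, once for the whole array and once per column.

```python
import numpy as np
import xarray as xr

demo = xr.DataArray(
    np.array([[1.0, np.nan, 3.0], [4.0, 5.0, np.nan]]),
    dims=("index", "columns"),
    coords={"index": [0, 1], "columns": ["A", "B", "C"]},
)

mask = demo.isnull()                        # True where a value is NaN
total_missing = mask.sum().item()           # number of NaNs in the array
missing_per_column = mask.sum(dim="index")  # NaN count for each column
```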

#### where()¶

The where() method lets us perform a conditional operation on an array. Its first argument is a condition and the second argument is the value to use wherever the condition evaluates to False.

Below we have printed two of our earlier arrays as a reference as we'll be testing where() function on them.

```arr, arr2
```
```(<xarray.DataArray 'Array' (index: 4, columns: 5)>
array([[0.38733228, 0.23109638, 0.66964265, 0.6708009 , 0.95829975],
[0.7713564 , 0.1166787 , 0.6483082 , 0.75409353, 0.76900532],
[0.54839661, 0.72950701, 0.65034097, 0.92334631, 0.70863973],
[0.0655017 , 0.56941354, 0.59030199, 0.5371372 , 0.45977435]])
Coordinates:
* index    (index) <U1 '0' '1' '2' '3'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'
Attributes:
index:      X-Dimension of Data
columns:    Y-Dimension of Data
info:       Pandas DataFrame
long_name:  Random Data
units:      Unknown,
<xarray.DataArray 'Array2' (index: 4, columns: 5)>
array([[0.07511355, 0.60393655, 0.74898288, 0.25581543, 0.79114767],
[0.73421379, 0.7067142 , 0.24650569, 0.2074986 , 0.41164924],
[0.50616351, 0.64518492, 0.66608194, 0.16975831, 0.67817385],
[0.36282616, 0.63435477, 0.56852942, 0.81083044, 0.46026918]])
Coordinates:
* index    (index) <U1 '0' '1' '2' '3'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E')```

Below we have called where() method on arr with the condition that the value of the array is greater than 0.5. Wherever the value is greater than 0.5, it is taken from arr; otherwise it is taken from arr2.

```arr.where(arr > 0.5, arr2)
```
```<xarray.DataArray 'Array' (index: 4, columns: 5)>
array([[0.07511355, 0.60393655, 0.66964265, 0.6708009 , 0.95829975],
[0.7713564 , 0.7067142 , 0.6483082 , 0.75409353, 0.76900532],
[0.54839661, 0.72950701, 0.65034097, 0.92334631, 0.70863973],
[0.36282616, 0.56941354, 0.59030199, 0.5371372 , 0.46026918]])
Coordinates:
* index    (index) <U1 '0' '1' '2' '3'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'
Attributes:
index:      X-Dimension of Data
columns:    Y-Dimension of Data
info:       Pandas DataFrame
long_name:  Random Data
units:      Unknown```
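
A few variations of where() are sketched below on a hypothetical array named `demo`: calling it without a replacement value (failing entries become NaN), with a scalar replacement, and through the top-level xr.where() function, which takes the condition followed by the values for True and False.

```python
import numpy as np
import xarray as xr

demo = xr.DataArray(
    np.array([[0.2, 0.7], [0.9, 0.4]]),
    dims=("index", "columns"),
)

# Without a second argument, entries failing the condition become NaN.
masked = demo.where(demo > 0.5)

# With a scalar as the second argument, failing entries are replaced by it.
floored = demo.where(demo > 0.5, 0.0)

# xr.where(cond, x, y) picks x where cond is True and y elsewhere.
flags = xr.where(demo > 0.5, 1, 0)
```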

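Below we have explained the usage of where() method with another example. Here the condition is evaluated on arr2 but the replacement value is arr itself, so the result is identical to arr.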

```arr.where(arr2 > 0.5, arr)
```
```<xarray.DataArray 'Array' (index: 4, columns: 5)>
array([[0.38733228, 0.23109638, 0.66964265, 0.6708009 , 0.95829975],
[0.7713564 , 0.1166787 , 0.6483082 , 0.75409353, 0.76900532],
[0.54839661, 0.72950701, 0.65034097, 0.92334631, 0.70863973],
[0.0655017 , 0.56941354, 0.59030199, 0.5371372 , 0.45977435]])
Coordinates:
* index    (index) <U1 '0' '1' '2' '3'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'
Attributes:
index:      X-Dimension of Data
columns:    Y-Dimension of Data
info:       Pandas DataFrame
long_name:  Random Data
units:      Unknown```

#### dot()¶

We can compute the dot product of two arrays using dot() method. We can compute dot products over named dimensions as well.

Below we have computed the dot product of two arrays over the dimension 'columns' present in both.

```xr.dot(arr, arr2, dims=["columns"])
```
```<xarray.DataArray (index: 4)>
array([1.59997017, 1.28164447, 1.81875229, 1.36772713])
Coordinates:
* index    (index) <U1 '0' '1' '2' '3'```

Below we have computed the dot product of the two arrays over the dimension 'index' present in both.

```xr.dot(arr, arr2, dims=["index"])
```
```<xarray.DataArray (columns: 5)>
array([0.89677849, 1.05390316, 1.43014696, 0.92034748, 1.76691796])
Coordinates:
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

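Below we have computed the dot product without specifying any dimension name. In this case the product is summed over all dimensions shared by the two arrays, which gives a single scalar.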

```xr.dot(arr,arr2)
```
```<xarray.DataArray ()>
array(6.06809405)```
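
A dot product over a dimension is the element-wise product summed over that dimension. The sketch below checks this on two hypothetical arrays `a` and `b`. It writes the per-dimension case as multiply-then-sum rather than passing a dimension keyword to xr.dot(), because the name of that keyword may differ between xarray versions (an assumption worth checking against your installed release).

```python
import numpy as np
import xarray as xr

dims = ("index", "columns")
a = xr.DataArray(np.array([[1.0, 2.0], [3.0, 4.0]]), dims=dims)
b = xr.DataArray(np.array([[5.0, 6.0], [7.0, 8.0]]), dims=dims)

# Dot product over 'columns' written as multiply-then-sum.
over_columns = (a * b).sum(dim="columns")

# With no dimension given, xr.dot sums over all shared dimensions.
full = xr.dot(a, b)
```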

#### drop()¶

The drop() method can be used to drop entries of an array based on a dimension and coordinate values along it. It accepts two values as input: the first is a list of coordinate values and the second is the dimension name. It then drops the entries of that dimension which have the specified coordinate values. The xarray documentation describes drop() as a backward-compatible method and encourages using drop_sel() or drop_vars() instead.

Below we have dropped the entries of the 'index' dimension which have coordinate values [0,1].

```arr1.drop(labels=[0,1], dim="index")
```
```<xarray.DataArray 'Array1' (index: 2, columns: 5)>
array([[0.93084546, 0.24244413, 0.82591196, 0.81989938, 0.68520558],
[0.24271528, 0.62774479, 0.66185214, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have created another example demonstrating the use of drop() method in cases where coordinate values are not integers.

```arr2.drop(labels=['0','1'], dim="index")
```
```<xarray.DataArray 'Array2' (index: 2, columns: 5)>
array([[0.50616351, 0.64518492, 0.66608194, 0.16975831, 0.67817385],
[0.36282616, 0.63435477, 0.56852942, 0.81083044, 0.46026918]])
Coordinates:
* index    (index) <U1 '2' '3'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have created another example demonstrating the use of drop() method. This time we are dropping entries along the 'columns' dimension of our array.

```arr1.drop(labels=["D","E"], dim="columns")
```
```<xarray.DataArray 'Array1' (index: 4, columns: 3)>
array([[0.57868507, 0.78605464, 0.90389917],
[0.46849765, 0.07263884, 0.20157703],
[0.93084546, 0.24244413, 0.82591196],
[0.24271528, 0.62774479, 0.66185214]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C'```

#### drop_isel()¶

The drop_isel() method works like the drop() method but it lets us specify the entries to drop by integer position instead of by coordinate value, which can be of any data type.

The drop_isel() method takes its arguments like isel() method and lets us specify positions along a dimension either as a dictionary or as keyword arguments of the method.

Below we have dropped the entry at position 0 along dimension 'index'.

```arr1.drop_isel({"index":0})
```
```<xarray.DataArray 'Array1' (index: 3, columns: 5)>
array([[0.46849765, 0.07263884, 0.20157703, 0.99471873, 0.93488703],
[0.93084546, 0.24244413, 0.82591196, 0.81989938, 0.68520558],
[0.24271528, 0.62774479, 0.66185214, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have dropped the entries at positions 0 and 1 along dimension 'index'.

```arr1.drop_isel({"index":[0,1]})
```
```<xarray.DataArray 'Array1' (index: 2, columns: 5)>
array([[0.93084546, 0.24244413, 0.82591196, 0.81989938, 0.68520558],
[0.24271528, 0.62774479, 0.66185214, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have created another example demonstrating the use of drop_isel() method to drop values across multiple dimensions of the array.

```arr1.drop_isel({"index":[0,1], "columns": [2,3,4]})
```
```<xarray.DataArray 'Array1' (index: 2, columns: 2)>
array([[0.93084546, 0.24244413],
[0.24271528, 0.62774479]])
Coordinates:
* index    (index) int64 2 3
* columns  (columns) <U1 'A' 'B'```

#### drop_sel()¶

The drop_sel() method works like drop_isel() with the only difference that it accepts coordinate values of the dimension instead of integer positions.

Below we have dropped entries from the array whose coordinate values are [0,1] for dimension 'index' and ["C","D","E"] for dimension 'columns'.

```arr1.drop_sel({"index":[0,1], "columns": ["C","D","E"]})
```
```<xarray.DataArray 'Array1' (index: 2, columns: 2)>
array([[0.93084546, 0.24244413],
[0.24271528, 0.62774479]])
Coordinates:
* index    (index) int64 2 3
* columns  (columns) <U1 'A' 'B'```

Below we have created another example demonstrating the use of drop_sel() method across multiple dimensions.

```arr2.drop_sel({"index":['0','1'], "columns": ["C","D","E"]})
```
```<xarray.DataArray 'Array2' (index: 2, columns: 2)>
array([[0.50616351, 0.64518492],
[0.36282616, 0.63435477]])
Coordinates:
* index    (index) <U1 '2' '3'
* columns  (columns) <U1 'A' 'B'```
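
The tutorial's arr1 has 'index' labels 0 to 3, so positions and labels coincide there. The sketch below uses a hypothetical array named `demo` with labels 10, 20 and 30 to show that drop_sel() works on labels while drop_isel() works on positions.

```python
import numpy as np
import xarray as xr

demo = xr.DataArray(
    np.arange(6).reshape(3, 2),
    dims=("index", "columns"),
    coords={"index": [10, 20, 30], "columns": ["A", "B"]},
)

by_label = demo.drop_sel(index=[10])     # drop the entry labeled 10
by_position = demo.drop_isel(index=[0])  # drop the entry at position 0
```

Both calls remove the same first row here, because label 10 sits at position 0.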

#### copy()¶

We can call copy() method on xarray DataArray to create a copy of it. By default this creates a new array with its own memory, so any modification to the copy is not reflected in the original array from which it was copied.

```arr1_copy = arr1.copy()

arr1_copy
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ],
[0.46849765, 0.07263884, 0.20157703, 0.99471873, 0.93488703],
[0.93084546, 0.24244413, 0.82591196, 0.81989938, 0.68520558],
[0.24271528, 0.62774479, 0.66185214, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
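
The independence of a copy from its source can be checked with a short sketch. The arrays `original` and `duplicate` below are hypothetical and only serve this illustration.

```python
import numpy as np
import xarray as xr

original = xr.DataArray(np.zeros((2, 2)), dims=("index", "columns"))

# copy() makes a deep copy by default, so the data is duplicated.
duplicate = original.copy()
duplicate[0, 0] = 99.0
```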

#### dropna(dim,how='any')¶

The dropna() method lets us drop entries along a dimension of the array that contain NaN values. It accepts the dimension name as the first parameter and the drop criterion (how) as the second parameter. There are two criteria for dropping entries.

• 'any' - This is the default. It'll drop entries of the dimension where even a single value is NaN.
• 'all' - It'll drop entries of the dimension where all values are NaN.

Below we have set a few entries to NaN in our array which we created by copying one of our existing arrays.

```arr1_copy[0,3] = np.nan

arr1_copy[2,4] = np.nan

arr1_copy
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 0.78605464, 0.90389917,        nan, 0.5950187 ],
[0.46849765, 0.07263884, 0.20157703, 0.99471873, 0.93488703],
[0.93084546, 0.24244413, 0.82591196, 0.81989938,        nan],
[0.24271528, 0.62774479, 0.66185214, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have called dropna() method to drop entries along the 'index' dimension. It'll drop entries where even a single value is NaN, which here are the entries with 'index' values 0 and 2.

```arr1_copy.dropna(dim="index")
```
```<xarray.DataArray 'Array1' (index: 2, columns: 5)>
array([[0.46849765, 0.07263884, 0.20157703, 0.99471873, 0.93488703],
[0.24271528, 0.62774479, 0.66185214, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 1 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have called dropna() method to drop entries along the 'columns' dimension. Columns 'D' and 'E' each contain a NaN, so both are dropped.

```arr1_copy.dropna(dim="columns")
```
```<xarray.DataArray 'Array1' (index: 4, columns: 3)>
array([[0.57868507, 0.78605464, 0.90389917],
[0.46849765, 0.07263884, 0.20157703],
[0.93084546, 0.24244413, 0.82591196],
[0.24271528, 0.62774479, 0.66185214]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C'```
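
The examples above only use the default how='any'. The sketch below contrasts it with how='all' on a hypothetical array named `demo` that has one partly missing row and one completely missing row.

```python
import numpy as np
import xarray as xr

demo = xr.DataArray(
    np.array([[1.0, np.nan], [np.nan, np.nan], [3.0, 4.0]]),
    dims=("index", "columns"),
    coords={"index": [0, 1, 2], "columns": ["A", "B"]},
)

drop_any = demo.dropna(dim="index", how="any")  # keeps only complete rows
drop_all = demo.dropna(dim="index", how="all")  # drops only all-NaN rows
```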

#### fillna(value)¶

We can use fillna() method to fill NaN values in the array. It accepts a value as input which is used in place of all NaNs.

```arr1_copy.fillna(value=9.99999)
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 0.78605464, 0.90389917, 9.99999   , 0.5950187 ],
[0.46849765, 0.07263884, 0.20157703, 0.99471873, 0.93488703],
[0.93084546, 0.24244413, 0.82591196, 0.81989938, 9.99999   ],
[0.24271528, 0.62774479, 0.66185214, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
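
Besides a scalar, fillna() can also take another DataArray, which is broadcast against the array being filled. The sketch below, on a hypothetical array named `demo`, fills each NaN with the mean of its column.

```python
import numpy as np
import xarray as xr

demo = xr.DataArray(
    np.array([[1.0, np.nan], [3.0, 4.0]]),
    dims=("index", "columns"),
    coords={"index": [0, 1], "columns": ["A", "B"]},
)

# Fill every NaN with the same scalar.
with_zero = demo.fillna(0.0)

# Fill each NaN with the mean of its column (NaNs are skipped in the mean).
column_means = demo.mean(dim="index")
with_mean = demo.fillna(column_means)
```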

#### drop_duplicates(dim)¶

The drop_duplicates() method lets us drop duplicates along a dimension. We need to provide the dimension name along which we want to drop duplicates. Note that duplicates are detected among the coordinate values of that dimension, not among the data values.

Below we have first created a copy of one of our existing arrays and then copied the first column of data into the third column to create duplicate data. We can notice from the array printed below that the 1st and 3rd columns have the same data.

```arr1_copy = arr1.copy()

arr1_copy[:, 2] = arr1_copy[:, 0]

arr1_copy
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 0.78605464, 0.57868507, 0.85013705, 0.5950187 ],
[0.46849765, 0.07263884, 0.46849765, 0.99471873, 0.93488703],
[0.93084546, 0.24244413, 0.93084546, 0.81989938, 0.68520558],
[0.24271528, 0.62774479, 0.24271528, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have called drop_duplicates() along the 'columns' dimension. The array is returned unchanged because the coordinate values 'A' to 'E' are all unique, even though columns 'A' and 'C' hold the same data.

```arr1_copy.drop_duplicates(dim='columns')
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 0.78605464, 0.57868507, 0.85013705, 0.5950187 ],
[0.46849765, 0.07263884, 0.46849765, 0.99471873, 0.93488703],
[0.93084546, 0.24244413, 0.93084546, 0.81989938, 0.68520558],
[0.24271528, 0.62774479, 0.24271528, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
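
To see drop_duplicates() actually remove something, the coordinate values themselves have to repeat. Below is a minimal sketch on a hypothetical 1-dimensional array named `demo` where the label 0 appears twice along 'x'.

```python
import xarray as xr

# Hypothetical 1-D array where the label 0 appears twice along 'x'.
demo = xr.DataArray([1, 2, 3], dims="x", coords={"x": [0, 0, 1]})

# Duplicated labels are dropped; the first occurrence is kept by default.
unique = demo.drop_duplicates(dim="x")
```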
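
The call above leaves the array unchanged because drop_duplicates() looks at coordinate labels, not at the data. As a minimal sketch, the example below builds a small array in which the label 'A' appears twice along 'columns' and shows that only the first occurrence survives. The array `da` and its labels are made up for illustration and are not part of this tutorial's data.

```python
import numpy as np
import xarray as xr

# Hypothetical array: the label 'A' appears twice along 'columns'.
da = xr.DataArray(
    np.arange(12).reshape(3, 4),
    dims=("index", "columns"),
    coords={"index": [0, 1, 2], "columns": ["A", "B", "A", "C"]},
)

# keep="first" is the default, so the second 'A' column is dropped.
deduped = da.drop_duplicates(dim="columns")

print(deduped["columns"].values)
print(deduped.shape)
```

If the goal is to remove columns with identical data rather than identical labels, that comparison has to be done on the values themselves.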

#### clip(min,max)¶

The clip() method lets us restrict the values of an array to a given range. It accepts a minimum value and a maximum value, then replaces every value below the minimum with the minimum and every value above the maximum with the maximum. Values already inside the range are left unchanged.

Below, we restrict the values of our array to the range [0.3, 0.6] using the clip() method.

```arr1.clip(min=0.3, max=0.6)
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 0.6       , 0.6       , 0.6       , 0.5950187 ],
[0.46849765, 0.3       , 0.3       , 0.6       , 0.6       ],
[0.6       , 0.3       , 0.6       , 0.6       , 0.6       ],
[0.3       , 0.6       , 0.6       , 0.41166893, 0.50476117]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
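
To make the effect of clip() easier to verify by eye, here is a small sketch on a hand-written 1D array. The array `values` is an assumption made for illustration; it also shows that either bound can be given on its own.

```python
import xarray as xr

# Hypothetical 1D array with values below, inside, and above the range.
values = xr.DataArray([0.1, 0.3, 0.45, 0.6, 0.9], dims="x")

# Both bounds: everything is pushed into [0.3, 0.6].
clipped = values.clip(min=0.3, max=0.6)

# Only a lower bound: large values are left untouched.
floor_only = values.clip(min=0.3)

print(clipped.values)
print(floor_only.values)
```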

#### concat(objs, dim)¶

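We can combine arrays along a dimension using the xr.concat() function. It accepts a sequence of arrays as the first parameter and a dimension name as the second parameter, and joins the arrays along that dimension.

Below, we combine two arrays along the 'index' dimension. Notice in the output that the 'index' coordinate has object dtype, because the first array uses string labels ('0' to '3') and the second uses integer labels (0 to 3).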

```xr.concat((arr,arr1), dim="index")
```
```<xarray.DataArray 'Array' (index: 8, columns: 5)>
array([[0.38733228, 0.23109638, 0.66964265, 0.6708009 , 0.95829975],
[0.7713564 , 0.1166787 , 0.6483082 , 0.75409353, 0.76900532],
[0.54839661, 0.72950701, 0.65034097, 0.92334631, 0.70863973],
[0.0655017 , 0.56941354, 0.59030199, 0.5371372 , 0.45977435],
[0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ],
[0.46849765, 0.07263884, 0.20157703, 0.99471873, 0.93488703],
[0.93084546, 0.24244413, 0.82591196, 0.81989938, 0.68520558],
[0.24271528, 0.62774479, 0.66185214, 0.41166893, 0.50476117]])
Coordinates:
* index    (index) object '0' '1' '2' '3' 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'
Attributes:
index:      X-Dimension of Data
columns:    Y-Dimension of Data
info:       Pandas DataFrame
long_name:  Random Data
units:      Unknown```

Below, we combine two arrays along the 'columns' dimension. The column labels 'A' to 'E' appear twice in the result because both arrays use the same labels.

```xr.concat((arr,arr2), dim="columns")
```
```<xarray.DataArray 'Array' (index: 4, columns: 10)>
array([[0.38733228, 0.23109638, 0.66964265, 0.6708009 , 0.95829975,
0.07511355, 0.60393655, 0.74898288, 0.25581543, 0.79114767],
[0.7713564 , 0.1166787 , 0.6483082 , 0.75409353, 0.76900532,
0.73421379, 0.7067142 , 0.24650569, 0.2074986 , 0.41164924],
[0.54839661, 0.72950701, 0.65034097, 0.92334631, 0.70863973,
0.50616351, 0.64518492, 0.66608194, 0.16975831, 0.67817385],
[0.0655017 , 0.56941354, 0.59030199, 0.5371372 , 0.45977435,
0.36282616, 0.63435477, 0.56852942, 0.81083044, 0.46026918]])
Coordinates:
* index    (index) <U1 '0' '1' '2' '3'
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E' 'A' 'B' 'C' 'D' 'E'
Attributes:
index:      X-Dimension of Data
columns:    Y-Dimension of Data
info:       Pandas DataFrame
long_name:  Random Data
units:      Unknown```
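
The following sketch repeats both kinds of concatenation on tiny arrays with non-overlapping labels, so the resulting shapes and coordinates are easy to predict. The arrays `top`, `bottom`, and `extra` are hypothetical and chosen only for illustration.

```python
import numpy as np
import xarray as xr

dims = ("index", "columns")

# Hypothetical inputs: 'top' and 'bottom' share columns, 'top' and 'extra' share index.
top = xr.DataArray(np.zeros((2, 3)), dims=dims,
                   coords={"index": [0, 1], "columns": ["A", "B", "C"]})
bottom = xr.DataArray(np.ones((2, 3)), dims=dims,
                      coords={"index": [2, 3], "columns": ["A", "B", "C"]})
extra = xr.DataArray(np.full((2, 2), 2.0), dims=dims,
                     coords={"index": [0, 1], "columns": ["D", "E"]})

# Stack rows: the 'index' dimension grows from 2 to 4.
rows = xr.concat([top, bottom], dim="index")

# Stack columns: the 'columns' dimension grows from 3 to 5.
cols = xr.concat([top, extra], dim="columns")

print(rows.shape, rows["index"].values)
print(cols.shape, cols["columns"].values)
```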

#### round()¶

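The round() method rounds the float values of an array. Called without arguments, it rounds to the nearest integer, as the output below shows. Like numpy's round, it also accepts the number of decimal places to keep.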

```arr1.round()
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[1., 1., 1., 1., 1.],
[0., 0., 0., 1., 1.],
[1., 0., 1., 1., 1.],
[0., 1., 1., 0., 1.]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
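
As a small sketch of rounding to a fixed number of decimals, the example below passes the decimals count positionally, the same way numpy's round accepts it. The array `measurements` is a made-up example.

```python
import xarray as xr

# Hypothetical values chosen to avoid exact .5 ties.
measurements = xr.DataArray([0.1234, 0.5678, 0.9], dims="x")

# No argument: round to the nearest integer.
whole = measurements.round()

# One argument: number of decimal places to keep.
two_decimals = measurements.round(2)

print(whole.values)
print(two_decimals.values)
```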

## 4. Simple Statistics ¶

In this section, we'll explain how we can perform simple statistics like sum, mean, variance, standard deviation, cumulative sum, cumulative product, etc.

#### sum(dim=None)¶

The sum() method calculates the sum along one or more dimensions. If we don't provide a dimension, it calculates the sum of all elements of the array.

Below, we first calculate the sum of all elements of the array. In the next cell, we calculate the sum along the 'index' dimension, which collapses that dimension and leaves one total per column.

```arr1.sum()
```
```<xarray.DataArray 'Array1' ()>
array(12.33916272)```
```arr1.sum(dim="index")
```
```<xarray.DataArray 'Array1' (columns: 5)>
array([2.22074346, 1.7288824 , 2.59324029, 3.07642409, 2.71987247])
Coordinates:
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
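
The sketch below shows which dimension disappears for each choice of dim, using a small integer array whose totals can be checked by hand. The array `small` is an assumption made for illustration.

```python
import xarray as xr

# Hypothetical 2x3 array with easy-to-add values.
small = xr.DataArray(
    [[1, 2, 3], [4, 5, 6]],
    dims=("index", "columns"),
    coords={"index": [0, 1], "columns": ["A", "B", "C"]},
)

total = small.sum()                     # no dim: a single scalar
per_column = small.sum(dim="index")     # 'index' is collapsed, 'columns' remains
per_row = small.sum(dim="columns")      # 'columns' is collapsed, 'index' remains

print(int(total))
print(per_column.dims, per_column.values)
print(per_row.dims, per_row.values)
```

The same dim argument works the same way for the other reductions in this section, such as min(), max(), std(), var(), and median().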

#### min(dim=None)¶

The min() method returns the minimum values along the given dimension.

Below, we first retrieve the minimum value of the whole array. In the next cell, we retrieve the minimum along the 'columns' dimension, which gives one value per 'index' entry.

```arr1.min()
```
```<xarray.DataArray 'Array1' ()>
array(0.07263884)```
```arr1.min(dim="columns")
```
```<xarray.DataArray 'Array1' (index: 4)>
array([0.57868507, 0.07263884, 0.24244413, 0.24271528])
Coordinates:
* index    (index) int64 0 1 2 3```

#### max(dim=None)¶

The max() method works exactly like min() but returns maximum values instead.

```arr1.max()
```
```<xarray.DataArray 'Array1' ()>
array(0.99471873)```
```arr1.max(dim="index")
```
```<xarray.DataArray 'Array1' (columns: 5)>
array([0.93084546, 0.78605464, 0.90389917, 0.99471873, 0.93488703])
Coordinates:
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

#### std(dim=None)¶

The std() method calculates the standard deviation along the dimensions of an array. Below, we compute it for the whole array and then along the 'columns' dimension.

```arr1.std()
```
```<xarray.DataArray 'Array1' ()>
array(0.26712418)```
```arr1.std(dim="columns")
```
```<xarray.DataArray 'Array1' (index: 4)>
array([0.13275403, 0.37433162, 0.24211231, 0.15232193])
Coordinates:
* index    (index) int64 0 1 2 3```

#### var(dim=None)¶

The var() method calculates the variance along the dimensions of an array. Below, we compute it for the whole array and then along the 'index' dimension.

```arr1.var()
```
```<xarray.DataArray 'Array1' ()>
array(0.07135533)```
```arr1.var(dim="index")
```
```<xarray.DataArray 'Array1' (columns: 5)>
array([0.06170627, 0.0821856 , 0.0741555 , 0.04695209, 0.02573124])
Coordinates:
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
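
As with numpy, std() and var() use ddof=0 by default, which gives the population statistic. The sketch below, on a made-up array `scores`, shows how passing ddof=1 switches to the sample statistic and confirms that the variance is the square of the standard deviation.

```python
import numpy as np
import xarray as xr

# Hypothetical data: each row is symmetric around its mean.
scores = xr.DataArray(
    [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]],
    dims=("index", "columns"),
)

population_std = scores.std(dim="columns")        # ddof=0 (default)
sample_std = scores.std(dim="columns", ddof=1)    # divides by n - 1
population_var = scores.var(dim="columns")

print(population_std.values)
print(sample_std.values)
print(np.allclose(population_var, population_std ** 2))
```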

#### median(dim=None)¶

The median() function helps us find the median across different dimensions of the array.

```arr1.median()
```
```<xarray.DataArray 'Array1' ()>
array(0.64479846)```
```arr1.median(dim="index")
```
```<xarray.DataArray 'Array1' (columns: 5)>
array([0.52359136, 0.43509446, 0.74388205, 0.83501821, 0.64011214])
Coordinates:
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

#### count(dim=None)¶

The count() method counts the number of non-NaN elements along the dimensions of an array. Our array has no missing values, so the counts below equal the number of elements.

```arr1.count()
```
```<xarray.DataArray 'Array1' ()>
array(20)```
```arr1.count(dim="index")
```
```<xarray.DataArray 'Array1' (columns: 5)>
array([4, 4, 4, 4, 4])
Coordinates:
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
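
The difference between count() and the array size only shows up when data is missing. The sketch below uses a hypothetical array `gappy` with two NaN entries to illustrate it.

```python
import numpy as np
import xarray as xr

# Hypothetical array with two missing values.
gappy = xr.DataArray(
    [[1.0, np.nan, 3.0], [4.0, 5.0, np.nan]],
    dims=("index", "columns"),
    coords={"columns": ["A", "B", "C"]},
)

valid_total = gappy.count()                  # NaNs are not counted
valid_per_column = gappy.count(dim="index")  # valid entries in each column

print(gappy.size, int(valid_total))
print(valid_per_column.values)
```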

#### cumprod(dim=None)¶

The cumprod() function helps us calculate cumulative product across different dimensions of the array.

Below we have first calculated cumulative product across 'index' dimension of the array and then in the next cell, we have calculated cumulative product across 'columns' dimension of the array.

```arr1.cumprod(dim='index')
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ],
[0.2711126 , 0.05709809, 0.18220531, 0.84564725, 0.55627526],
[0.25236393, 0.0138431 , 0.15048555, 0.69334565, 0.38116291],
[0.06125258, 0.00868993, 0.09959918, 0.28542886, 0.19239624]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
```arr1.cumprod(dim='columns')
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 0.45487809, 0.41116392, 0.34954568, 0.20798622],
[0.46849765, 0.03403112, 0.00685989, 0.00682366, 0.00637935],
[0.93084546, 0.22567802, 0.18639018, 0.15282119, 0.10471393],
[0.24271528, 0.15236325, 0.10084194, 0.04151349, 0.0209544 ]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

#### cumsum(dim=None)¶

The cumsum() function helps us find cumulative sum across different dimensions of the array and works exactly like cumprod() function.

```arr1.cumsum(dim='index')
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 0.78605464, 0.90389917, 0.85013705, 0.5950187 ],
[1.04718272, 0.85869348, 1.1054762 , 1.84485578, 1.52990572],
[1.97802818, 1.10113761, 1.93138816, 2.66475516, 2.2151113 ],
[2.22074346, 1.7288824 , 2.59324029, 3.07642409, 2.71987247]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
```arr1.cumsum(dim='columns')
```
```<xarray.DataArray 'Array1' (index: 4, columns: 5)>
array([[0.57868507, 1.36473971, 2.26863888, 3.11877593, 3.71379463],
[0.46849765, 0.54113648, 0.74271352, 1.73743225, 2.67231928],
[0.93084546, 1.1732896 , 1.99920155, 2.81910093, 3.50430651],
[0.24271528, 0.87046007, 1.5323122 , 1.94398113, 2.44874231]])
Coordinates:
* index    (index) int64 0 1 2 3
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
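
Unlike sum(), the cumulative functions keep the shape of the array: each entry holds the running total (or product) up to that position along the chosen dimension. The sketch below uses a small integer array, `steps`, invented for illustration so the results can be checked by hand.

```python
import xarray as xr

# Hypothetical 2x3 integer array.
steps = xr.DataArray([[1, 2, 3], [4, 5, 6]], dims=("index", "columns"))

running_sum = steps.cumsum(dim="columns")    # accumulate left to right in each row
running_prod = steps.cumprod(dim="index")    # accumulate top to bottom in each column

print(running_sum.values)
print(running_prod.values)

# The last entry along the accumulated dimension equals the plain sum.
print(running_sum.isel(columns=-1).values, steps.sum(dim="columns").values)
```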

#### corr()¶

The xr.corr() function computes the Pearson correlation coefficient between two DataArray objects, optionally along a given dimension.

Below, we first calculate the correlation between two arrays of the same shape using all of their elements. Then we calculate the correlation along the 'index' and 'columns' dimensions respectively. With dim="index", the values of each column are taken from both arrays as 1D series and correlated, which gives one coefficient per column. With dim="columns", the same is done for each row.

```xr.corr(arr, arr2)
```
```<xarray.DataArray ()>
array(0.0211588)```
```xr.corr(arr, arr2, dim="index")
```
```<xarray.DataArray (columns: 5)>
array([ 0.62264745, -0.33522625,  0.16205577, -0.8278302 ,  0.64970906])
Coordinates:
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
```xr.corr(arr, arr2, dim="columns")
```
```<xarray.DataArray (index: 4)>
array([ 0.43851828, -0.46300051, -0.67088344,  0.71856875])
Coordinates:
* index    (index) <U1 '0' '1' '2' '3'```
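
One way to convince ourselves what the dim argument does is to compare xr.corr() with numpy's corrcoef() computed column by column. The sketch below does that on two random arrays, `a` and `b`, which are generated here only for illustration and are not the arrays used above.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
coords = {"columns": ["A", "B", "C"]}

# Hypothetical inputs with matching dimensions and labels.
a = xr.DataArray(rng.random((4, 3)), dims=("index", "columns"), coords=coords)
b = xr.DataArray(rng.random((4, 3)), dims=("index", "columns"), coords=coords)

# One coefficient per column: values along 'index' are paired up.
per_column = xr.corr(a, b, dim="index")

# Same computation with plain numpy, one column at a time.
expected = [np.corrcoef(a.values[:, j], b.values[:, j])[0, 1] for j in range(3)]

# Without dim, all elements are treated as one long pair of series.
overall = xr.corr(a, b)
expected_overall = np.corrcoef(a.values.ravel(), b.values.ravel())[0, 1]

print(np.allclose(per_column.values, expected))
print(np.isclose(float(overall), expected_overall))
```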

#### rolling()¶

The rolling() method lets us perform rolling window functions on xarray DataArray objects. It accepts the dimension along which to roll and the window size as input. We can provide the dimension name and window size either as a dictionary or as a keyword argument. After creating the rolling object, we can apply aggregate functions like mean, standard deviation, sum, variance, etc. to the rolled windows of data.

Below, we apply a rolling window to our array along the 'index' dimension with a window size of 2 and take the average of each window. The first row of the result is NaN because a full window of 2 entries is not yet available at that position.

If you want to know how to perform moving window functions in pandas then please feel free to check our tutorial on the same where we cover the topic in detail.

```rolling_mean = arr3.rolling({"index": 2}).mean()

rolling_mean
```
```<xarray.DataArray 'Array3' (index: 4, columns: 5)>
array([[       nan,        nan,        nan,        nan,        nan],
[0.30568927, 0.84770355, 0.47577914, 0.33966321, 0.23168078],
[0.57932296, 0.55677522, 0.03106052, 0.58404636, 0.5049073 ],
[0.47388237, 0.3936182 , 0.36601416, 0.6863906 , 0.62810675]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 2021-01-02 2021-01-03 2021-01-04
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below, we recreate the previous example, this time passing the dimension name and window size as a keyword argument.

```rolling_mean = arr3.rolling(index=2).mean()

rolling_mean
```
```<xarray.DataArray 'Array3' (index: 4, columns: 5)>
array([[       nan,        nan,        nan,        nan,        nan],
[0.30568927, 0.84770355, 0.47577914, 0.33966321, 0.23168078],
[0.57932296, 0.55677522, 0.03106052, 0.58404636, 0.5049073 ],
[0.47388237, 0.3936182 , 0.36601416, 0.6863906 , 0.62810675]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 2021-01-02 2021-01-03 2021-01-04
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below we have created another example where we are performing a rolling window function on our array at 'columns' dimension with a window size of 3. We have then taken standard deviation on data windows.

```rolling_std = arr3.rolling({"columns": 3}).std()

rolling_std
```
```<xarray.DataArray 'Array3' (index: 4, columns: 5)>
array([[       nan,        nan, 0.23202869, 0.41078754, 0.38733677],
[       nan,        nan, 0.38156689, 0.37894448, 0.29049268],
[       nan,        nan, 0.38635193, 0.18272065, 0.34157819],
[       nan,        nan, 0.29524203, 0.12527674, 0.21038439]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 2021-01-02 2021-01-03 2021-01-04
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
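
The leading NaNs in the outputs above come from positions where the window is not yet full. As a minimal sketch on a made-up 1D array `series`, the example below shows the default behaviour next to two options of rolling(): min_periods, which allows partial windows, and center, which places the label in the middle of the window.

```python
import numpy as np
import xarray as xr

# Hypothetical 1D series.
series = xr.DataArray([1.0, 2.0, 3.0, 4.0, 5.0], dims="index")

# Default: a window needs 2 valid values, so the first entry is NaN.
default_mean = series.rolling(index=2).mean()

# min_periods=1: partial windows are allowed, so no NaN at the start.
partial_mean = series.rolling(index=2, min_periods=1).mean()

# center=True with window 3: NaN at both ends instead of only at the start.
centered_mean = series.rolling(index=3, center=True).mean()

print(default_mean.values)
print(partial_mean.values)
print(centered_mean.values)
```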

#### resample()¶

The resample() method is useful when a dimension holds datetime values and we want to resample it at a different frequency than the current one. Resampling can be of two types.

1. Up Sampling - We increase the sampling frequency, which adds entries. E.g. - daily to 6 hourly.
2. Down Sampling - We decrease the sampling frequency, which aggregates entries. E.g. - daily to monthly.

The resample() method takes a dimension name and a new frequency as input to resample an xarray DataArray. We can provide the dimension and frequency either as a dictionary or as a keyword argument.

If you are interested in learning about resampling using pandas then please feel free to check our tutorial where we discuss resampling in detail.

##### Down Sampling¶

Below, we take the array whose 'index' dimension has daily datetime coordinates and resample it to a 2-day frequency. This lowers the sampling frequency, so we have downsampled the array. The resample() call returns a DataArrayResample object that groups the entries into 2-day bins. Calling mean() on it then replaces each bin with the average of its values.

```two_day_sampled = arr3.resample({"index": "2D"})

two_day_sampled
```
```DataArrayResample, grouped over '__resample_dim__'
2 groups with labels 2021-01-01, 2021-01-03.```
```for dt, darray in two_day_sampled:
    print(dt, darray.shape, darray.dims)
```
```2021-01-01T00:00:00.000000000 (2, 5) ('index', 'columns')
2021-01-03T00:00:00.000000000 (2, 5) ('index', 'columns')
```
```two_day_sampled_mean = two_day_sampled.mean()

two_day_sampled_mean
```
```<xarray.DataArray 'Array3' (index: 2, columns: 5)>
array([[0.30568927, 0.84770355, 0.47577914, 0.33966321, 0.23168078],
[0.47388237, 0.3936182 , 0.36601416, 0.6863906 , 0.62810675]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 2021-01-03
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```

Below, we recreate the previous example by providing the dimension name and frequency as a keyword argument.

```two_day_sampled_mean = arr3.resample(index="2D").mean()

two_day_sampled_mean
```
```<xarray.DataArray 'Array3' (index: 2, columns: 5)>
array([[0.30568927, 0.84770355, 0.47577914, 0.33966321, 0.23168078],
[0.47388237, 0.3936182 , 0.36601416, 0.6863906 , 0.62810675]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 2021-01-03
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
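
To check the 2-day averaging by hand, the sketch below repeats the same resampling on a small daily series with known values. The array `daily` is hypothetical and stands in for the tutorial's array.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical daily series: four days with values 1 to 4.
daily = xr.DataArray(
    [1.0, 2.0, 3.0, 4.0],
    dims="index",
    coords={"index": pd.date_range("2021-01-01", periods=4, freq="D")},
)

# Two bins of two days each; every bin is replaced by its mean.
two_day_mean = daily.resample(index="2D").mean()

print(two_day_mean["index"].values)
print(two_day_mean.values)
```

Each bin is labeled with its first day, which matches the group labels 2021-01-01 and 2021-01-03 printed earlier.
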

##### Up Sampling¶

In this section, we upsample our DataArray from daily frequency to 12-hourly frequency. Upsampling introduces new entries for timestamps that were not present in the original data, and those entries hold NaNs. Our data has one entry per day and nothing at the 12-hour marks in between. We can fill the NaNs with xarray methods like ffill(), bfill(), fillna(), etc.

After upsampling, we take the average of the resampled entries. We also display the 'index' dimension data for verification purposes.

```twelve_hour_sampled_mean = arr3.resample({"index": "12H"}).mean()

twelve_hour_sampled_mean
```
```<xarray.DataArray 'Array3' (index: 7, columns: 5)>
array([[0.39792208, 0.79787484, 0.94760726, 0.01103115, 0.34796905],
[       nan,        nan,        nan,        nan,        nan],
[0.21345645, 0.89753226, 0.00395103, 0.66829528, 0.11539251],
[       nan,        nan,        nan,        nan,        nan],
[0.94518946, 0.21601817, 0.05817   , 0.49979745, 0.89442209],
[       nan,        nan,        nan,        nan,        nan],
[0.00257528, 0.57121823, 0.67385832, 0.87298376, 0.36179141]])
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 ... 2021-01-04
* columns  (columns) <U1 'A' 'B' 'C' 'D' 'E'```
```twelve_hour_sampled_mean["index"]
```
```<xarray.DataArray 'index' (index: 7)>
array(['2021-01-01T00:00:00.000000000', '2021-01-01T12:00:00.000000000',
'2021-01-02T00:00:00.000000000', '2021-01-02T12:00:00.000000000',
'2021-01-03T00:00:00.000000000', '2021-01-03T12:00:00.000000000',
'2021-01-04T00:00:00.000000000'], dtype='datetime64[ns]')
Coordinates:
* index    (index) datetime64[ns] 2021-01-01 ... 2021-01-04```
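
The sketch below shows the same NaN pattern on a small series and one way to fill it. It is an illustration under assumptions: the array `sparse` is made up, it goes from a 2-day spacing to daily (rather than daily to 12-hourly) to keep the frequency alias simple, and it uses asfreq() instead of mean() to make the re-indexing step explicit.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical series with one value every two days.
sparse = xr.DataArray(
    [1.0, 3.0, 5.0],
    dims="index",
    coords={"index": pd.date_range("2021-01-01", periods=3, freq="2D")},
)

# Upsample to daily: the new timestamps have no data, so they hold NaN.
upsampled = sparse.resample(index="1D").asfreq()

# Replace the NaNs with a constant; other fill strategies are possible.
filled = upsampled.fillna(0.0)

print(upsampled.values)
print(filled.values)
```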

This ends our small tutorial explaining the DataArray data structure of xarray to hold and manipulate data. Please feel free to let us know your views in the comments section.

## Reference¶

Sunny Solanki
