In this lesson we will define the package that contains the active conference. This package follows the ViewModel pattern, which separates the data from the presentation logic. Thanks to this separation, the conference survives configuration changes, such as rotating the device so that the layout switches from portrait to landscape.

The layout will contain a couple of SurfaceViewRenderers. These special views display the video streams; you can think of them as the equivalent of the HTML video tag. In this case we need two of them: the first shows the mirrored feed from the local camera, and the second shows the remote video with all the other participants.

During this tutorial we will perform the following tasks:

  • Modify the navigation to allow going from the FormFragment to the ConferenceFragment.

  • Modify the layout for the FormFragment and add all the input fields and the button to start the conference.

  • Modify the fragment called FormFragment so that it has a click listener that navigates to the ConferenceFragment.

  • Modify the ViewModel called ConferenceViewModel, which will manage all the conference logic and state.

  • Modify the layout for the ConferenceFragment. This layout will have two regions for the videos (local and remote) and another region for the toolbar to perform actions.

  • Modify the fragment called ConferenceFragment. This is the final step, and here you have to verify that the user granted the camera and microphone permissions. The fragment will also act as a UI controller for the layout.

You can download the starter code from Step.1-Exercise-Join-conference.

Modify the navigation

At this point we can only navigate to the form fragment, which is not very fancy. In this section we will configure the navigation behavior.

The first step is to modify the FormFragment and add an action to it. This action will be used to navigate to the ConferenceFragment. To improve the UX of these transitions, we will attach some animations to the action.

app:popUpTo="@id/formFragment" />

Now we have to define the ConferenceFragment that will receive three arguments: node, vmr and display_name.

app:argType="string" />
app:argType="string" />
app:argType="string" />

The value of the label for the FormFragment is the app name. This value will be used as the ActionBar title. For the ConferenceFragment this value is empty, because the title will be dynamic: we will set it later, using the Virtual Meeting Room name as the title.

Modify the form package

This package will contain the FormFragment, which will be the entry point of our application. Inside, we will find the FormFragment.kt file, which inflates the fragment_form.xml layout.

At this point, the fragment only displays a text saying "Hello Form!". You will replace this view with the form and implement a function that navigates to the ConferenceFragment when the user pushes a button.

Modify the form layout

The first step is to define the layout that we want to use. In this case we will arrange all the elements in a vertical stack, so we start by defining a LinearLayout with a vertical orientation.



Inside the LinearLayout you need to define all the form views. The first one to define will be the one to obtain the IP or domain of the Conferencing Node.


Now you have to define another one to obtain the conference name or VMR (Virtual Meeting Room).


The last TextInput will obtain the user's display name.


Finally, define the button that submits the form.

android:textStyle="bold" />

Modify the form fragment

The first step in this fragment is to change the way the view is inflated. In this case, we will start using data binding. Go to onCreateView() and define the click listener that is called each time the user pushes the join button.

In this lambda function we obtain the values of all the edit text views and pass them to the ConferenceFragment. The last instruction navigates to the new fragment.

override fun onCreateView(
    inflater: LayoutInflater, container: ViewGroup?,
    savedInstanceState: Bundle?
): View? {
    // Inflate the layout for this fragment
    val binding = FragmentFormBinding.inflate(inflater, container, false)
    binding.joinButton.setOnClickListener {

        val node = binding.nodeText.editText?.text.toString()
        val vmr = binding.vmrText.editText?.text.toString()
        val displayName = binding.displayNameText.editText?.text.toString()

        val action = FormFragmentDirections.actionFormFragmentToConferenceFragment(
            node, vmr, displayName
        )
        // Navigate to the ConferenceFragment, passing the form values as arguments
        findNavController().navigate(action)
    }
    return binding.root
}
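One subtlety in the listener above: calling toString() on a nullable value never fails, but a null value becomes the literal text "null". The following plain-Kotlin sketch shows a null-safe alternative. Both function names are hypothetical helpers introduced only for illustration; they are not part of the tutorial code or of any Android library.

```kotlin
// Hypothetical helper (an assumption for illustration, not from the tutorial).
// Converts a possibly-null text value into a trimmed String, using "" for null.
fun readField(text: CharSequence?): String =
    text?.toString()?.trim().orEmpty()

// For comparison: this mirrors the `editText?.text.toString()` form,
// where a null receiver produces the string "null".
fun readFieldUnsafe(text: CharSequence?): String = text.toString()
```

In the click listener this could be used as readField(binding.nodeText.editText?.text), which makes it easy to detect an empty node, VMR or display name before navigating.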

Modify the conference package

In the previous section you defined the fragment that gathers all the conference settings. Now it's time to define a fragment that will contain the conference itself.

Modify the conference ViewModel

To improve the architecture, you will define all the logic in a ViewModel. This way the conference will survive even if the fragment is destroyed due to a configuration change.

Start by defining the eglBase that stores the EGL state and utility methods that we will use in several places of our application.

// Initialize EGL
val eglBase: EglBase = EglBase.create()

Now we need to define the variables that store the local and remote tracks. The localVideoTrack and remoteVideoTrack are LiveData. This way, the fragment will be notified of any change in these values, so it can render them in the layout.

// AudioTrack from the local microphone
private lateinit var localAudioTrack: LocalAudioTrack

// Local VideoTrack
private val _localVideoTrack = MutableLiveData<CameraVideoTrack>()
val localVideoTrack: LiveData<CameraVideoTrack>
    get() = _localVideoTrack

// Remote VideoTrack
private val _remoteVideoTrack = MutableLiveData<VideoTrack>()
val remoteVideoTrack: LiveData<VideoTrack>
    get() = _remoteVideoTrack

Define two more LiveData variables to detect when the conference is connected and when an error occurs.

// Notify if the user is connected to the conference or not
private val _isConnected = MutableLiveData<Boolean>()
val isConnected: LiveData<Boolean>
    get() = _isConnected

// Used to inform the fragment of an error
private val _onError = MutableLiveData<Throwable>()
val onError: LiveData<Throwable>
    get() = _onError
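The pairs of properties above follow the backing-property pattern: the mutable holder stays private, and only a read-only view is exposed. The sketch below reproduces the idea in plain Kotlin so it can run outside Android. ReadOnlyValue, MutableValue and ConnectionState are simplified, hypothetical stand-ins for LiveData, MutableLiveData and the ViewModel; they are assumptions for illustration and do not reflect how the real classes are implemented.

```kotlin
// Simplified stand-in for LiveData: observers can read and subscribe, not write.
open class ReadOnlyValue<T>(initial: T) {
    private val observers = mutableListOf<(T) -> Unit>()

    var value: T = initial
        protected set(newValue) {
            field = newValue
            // Notify every registered observer of the new value
            observers.forEach { it(newValue) }
        }

    fun observe(observer: (T) -> Unit) {
        observers += observer
    }
}

// Simplified stand-in for MutableLiveData: adds a public way to change the value.
class MutableValue<T>(initial: T) : ReadOnlyValue<T>(initial) {
    fun update(newValue: T) {
        value = newValue
    }
}

// Stand-in for the ViewModel: only this class can change the state.
class ConnectionState {
    private val _isConnected = MutableValue(false)
    val isConnected: ReadOnlyValue<Boolean>
        get() = _isConnected

    fun connect() = _isConnected.update(true)
}
```

Code outside ConnectionState can observe isConnected but has no way to set it, which is the same guarantee the fragment gets from the LiveData properties.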

You need to save the webRtcMediaConnectionFactory, since it will be needed later to make changes to the connection. You also need to save the conference and the mediaConnection, so that both can be disposed of after the conference ends.

// Objects needed to initialize the conference
private val webRtcMediaConnectionFactory: WebRtcMediaConnectionFactory

// Objects that save the conference state
private lateinit var conference: InfinityConference
private lateinit var mediaConnection: MediaConnection

The next step is to define the constructor. This constructor is very simple, since it only initializes the webRtcMediaConnectionFactory.

The webRtcMediaConnectionFactory object will be key in our application. It will be used to create the media connection and to obtain access to the microphone, the camera or even the screen.

init {
    // Create the webRtcMediaConnectionFactory
    webRtcMediaConnectionFactory = WebRtcMediaConnectionFactory(
        context = application,
        eglBase = eglBase
    )
}

Now you have to define what should happen when the ViewModel is destroyed. Since the ViewModel owner is the ConferenceFragment, this piece of code runs when the user navigates to another fragment or closes the application. When this happens, you need to leave the current conference and dispose of the media connection and the local tracks.

override fun onCleared() {
if (this::conference.isInitialized) {
if (this::mediaConnection.isInitialized) {
if (this::localAudioTrack.isInitialized) {
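The isInitialized checks matter because reading a lateinit property before it has been assigned throws an exception, and onCleared() can run before the conference was ever started. A minimal plain-Kotlin sketch of this guard follows. Resource and Session are hypothetical stand-ins for the SDK objects and the ViewModel; they are assumptions made only for this illustration.

```kotlin
// Hypothetical stand-in for an object that must be released, such as a media connection.
class Resource(val name: String) {
    var disposed = false
        private set

    fun dispose() {
        disposed = true
    }
}

// Hypothetical stand-in for the ViewModel.
class Session {
    lateinit var connection: Resource

    fun start() {
        connection = Resource("media")
    }

    // Returns true if there was something to release.
    fun clear(): Boolean {
        // Touching `connection` before start() would throw
        // UninitializedPropertyAccessException, so check first.
        if (this::connection.isInitialized) {
            connection.dispose()
            return true
        }
        return false
    }
}
```

Calling clear() on a session that never started is a harmless no-op, while a started session gets its resource disposed exactly as intended.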

Our main public method will be startConference(). This method receives the same information as the ConferenceFragment and tries to join the conference.

As this process takes some time, the method launches a coroutine. Once the process is complete, one of two things happens: either isConnected changes to true, or it changes to false and onError receives the exception.

fun startConference(node: String, vmr: String, displayName: String) {
val exceptionHandler = CoroutineExceptionHandler {_, exception->
// Convert the error into a more descriptive message
viewModelScope.launch(exceptionHandler) {
// Authenticate to the conference
conference = createConference(node, vmr, displayName)

// Get access to the local microphone and camera
val (audioTrack, videoTrack) = getLocalMedia()
localAudioTrack = audioTrack
_localVideoTrack.value = videoTrack

// Initialize the WebRTC media connection. We will be sending and receiving media.
startWebRTCConnection(conference, audioTrack, videoTrack)

Now you have to define three new private methods: createConference(), getLocalMedia() and startWebRTCConnection().

Let's start by implementing createConference(). Notice that the method is defined as suspend. This means that it can suspend without blocking the thread and that it must be called from a coroutine or from another suspend function.

If the conference is valid and the user has access to it, the method will return an InfinityConference object.

private suspend fun createConference(
node: String,
vmr: String,
displayName: String
): InfinityConference {

val okHttpClient = OkHttpClient()
val request = RequestTokenRequest(displayName = displayName)
val infinityService = InfinityService.create(okHttpClient)
lateinit var conference: InfinityConference
val nodeUrl = URL("https://${node}")

return withContext(Dispatchers.IO) {
val response = infinityService.newRequest(nodeUrl)
conference = InfinityConference.create(
service = infinityService,
node = nodeUrl,
conferenceAlias = vmr,
response = response
return@withContext conference

With the Pexip SDK you can listen to conference events. In this case, we will detect the DisconnectConferenceEvent and, when it arrives, change the value of isConnected to false. This way, the app will leave the ConferenceFragment and go back to the FormFragment.

private fun configureConferenceListeners(conference: InfinityConference) {
conference.registerConferenceEventListener(ConferenceEventListener { event ->
when (event) {
is DisconnectConferenceEvent -> {
else -> {
Log.d("ConferenceViewModel", event.toString())

Now it's time to obtain access to the camera and microphone. Once access has been granted, the method creates the videoTrack and the audioTrack and returns a reference to both of them.

private fun getLocalMedia(): Pair<LocalAudioTrack, CameraVideoTrack> {
    val audioTrack: LocalAudioTrack = webRtcMediaConnectionFactory.createLocalAudioTrack()
    val videoTrack: CameraVideoTrack = webRtcMediaConnectionFactory.createCameraVideoTrack()
    return audioTrack to videoTrack
}

The last step is to start the media connection and obtain the remoteVideoTrack. The media connection uses WebRTC under the hood and, before starting it, you have to define its configuration.

The key parameter is the iceServer, which defines the STUN and TURN servers. These servers are used to perform the Interactive Connectivity Establishment (ICE), which is a method for discovering the best path between two endpoints that could be behind NAT.

private fun startWebRTCConnection(
conference: InfinityConference,
localAudioTrack: LocalAudioTrack,
localVideoTrack: CameraVideoTrack
) {
// Define the STUN server. It is used to obtain the public IP of the participants,
// which makes it possible to establish the media connection.
val iceServer = IceServer.Builder("").build()
val config = MediaConnectionConfig.Builder(conference)

// Save the media connection in a class private variable. We need it later
// for disposing the media connection.
mediaConnection = webRtcMediaConnectionFactory.createMediaConnection(config)

// Attach the local media streams to the media connection.

// Define a callback method for when the remote video is received.
val mainRemoteVideoTrackListener = MediaConnection.RemoteVideoTrackListener { videoTrack ->
// We have to use postValue instead of value, because this callback runs on another thread.

// Attach the callback to the media connection.

// Start the media connection.

The final step is to define a method that is triggered when the user clicks the hang up button. In this case, the method only changes the value of the LiveData. The ConferenceFragment will detect this change and navigate to the FormFragment.

fun onDisconnect() {
    _isConnected.value = false
}

Modify the conference layout

Now it's time to define what the user should see during a conference and where each element will be located.

In the layout we need to remove the TextView and add the following elements inside a ConstraintLayout:

  • ProgressBar: This is a loading icon that the system will display while the conference is starting.

  • ConstraintLayout: This view will only be visible when the user is connected to the conference.

app:layout_constraintTop_toTopOf="parent" />



In the inner ConstraintLayout we need to add the following items:

  • SurfaceViewRenderer: This region is where the remote video stream will be rendered. It will fill the whole layout and the rest of the views will be drawn on top of it.

  • CardView and SurfaceViewRenderer: This view will be the container for our local video. The CardView gives us a better UI, since it adds a border with a shadow.

app:layout_constraintTop_toTopOf="parent" />


android:layout_gravity="center" />

Below the last CardView, define a LinearLayout and, inside it, the hang up button.


android:onClick="@{() -> viewModel.onDisconnect()}"
app:iconSize="40dp" />

In the next lessons we will add some more buttons to this layout to perform other actions.

Modify the conference fragment

Let's start by defining two variables that will store the data binding and the ViewModel. We will need access to both of them in several sections of the code.

private lateinit var binding: FragmentConferenceBinding
private lateinit var viewModel: ConferenceViewModel

As in all the fragments, the main method is onCreateView(). In this case we start by obtaining the SafeArgs that contain the node, vmr and display name. Then we obtain the application object and a reference to the ViewModel. After that, we need to initialize all the elements in our fragment and the observers for the LiveData.

The last step is to check if the user is already in a conference and, if not, check the media permissions and start a conference.

override fun onCreateView(
    inflater: LayoutInflater, container: ViewGroup?,
    savedInstanceState: Bundle?
): View? {

    // This variable has the node, vmr and displayName
    val args by navArgs<ConferenceFragmentArgs>()

    // Change the Action Bar title to show the Virtual Meeting Room name
    (activity as AppCompatActivity).supportActionBar?.title = args.vmr

    // Inflate the layout for this fragment
    binding = FragmentConferenceBinding.inflate(inflater, container, false)

    // Create an instance of the viewModel and attach it to the data binding
    val application = requireNotNull(this.activity).application
    val viewModelFactory = ConferenceViewModelFactory(application)
    viewModel = ViewModelProvider(this, viewModelFactory)[ConferenceViewModel::class.java]
    binding.viewModel = viewModel

    // Assign this fragment as lifecycle owner
    binding.lifecycleOwner = this

    // Initialize the containers for the videoTracks
    initializeVideoSurfaces()

    // Set all observers
    setVideoObservers()
    setConnectionObservers(args.node, args.vmr, args.displayName)

    if (viewModel.isConnected.value != true) {
        // Check the media permissions or show a pop-up to accept them
        checkMediaPermissions {
            // Callback once the permission was correctly checked
            viewModel.startConference(args.node, args.vmr, args.displayName)
        }
    }

    return binding.root
}


Now we must define what should happen once the fragment's view is destroyed. In this case, we need to free all the resources related to the SurfaceViewRenderers.

override fun onDestroyView() {

Now we will define what should happen when another activity comes into the foreground and our activity is no longer visible. In this case, we will leave the conference active, but stop capturing the local camera. This is a way to protect the user's privacy.

However, we will leave the microphone active, so the user can continue the conversation even after switching to another app.

override fun onStop() {

Once the user comes back to the application you should start capturing the camera again.

override fun onStart() {

Now we need to define the private methods that are called from onCreateView() .

Let's start with initializeVideoSurfaces(). This method initializes a special type of view called SurfaceViewRenderer. These views are the elements where the videoTracks are rendered.

This method performs two special tasks. First, it flips the local video, so users see their own image mirrored. Second, it scales the remote video so that the user sees the whole remote video. Without this scaling, the app would crop the left and right sections of the video.

private fun initializeVideoSurfaces() {
    // Mirror the local video
    binding.localVideoSurface.setMirror(true)

    // Show all the video inside the container
    binding.mainVideoSurface.setScalingType(RendererCommon.ScalingType.SCALE_ASPECT_FIT)

    // Initialize the video surfaces
    binding.localVideoSurface.init(viewModel.eglBase.eglBaseContext, null)
    binding.mainVideoSurface.init(viewModel.eglBase.eglBaseContext, null)
}
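To see why the scaling choice decides whether the remote video is cropped, the arithmetic can be sketched in plain Kotlin. This is only an illustration of the two common strategies (fit the whole frame versus fill the container); Size, aspectFit and aspectFill are hypothetical names, and the code is not how the renderer is implemented.

```kotlin
import kotlin.math.roundToInt

// Hypothetical value type used only for this illustration.
data class Size(val width: Int, val height: Int)

// Scale the video so that it fits entirely inside the container (nothing is cropped).
fun aspectFit(video: Size, container: Size): Size {
    val scale = minOf(
        container.width.toDouble() / video.width,
        container.height.toDouble() / video.height
    )
    return Size((video.width * scale).roundToInt(), (video.height * scale).roundToInt())
}

// Scale the video so that it covers the whole container (the overflow is cropped).
fun aspectFill(video: Size, container: Size): Size {
    val scale = maxOf(
        container.width.toDouble() / video.width,
        container.height.toDouble() / video.height
    )
    return Size((video.width * scale).roundToInt(), (video.height * scale).roundToInt())
}
```

For a 1600x900 landscape video in an 800x1600 portrait container, aspectFit gives 800x450, so the whole frame is visible with empty bands above and below. aspectFill gives a frame 1600 pixels high and far wider than the 800-pixel container, so the left and right sections fall outside the view.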

Now let's continue with setVideoObservers(). This method observes changes in the local and remote video tracks and adds the renderer when a new track is detected.

private fun setVideoObservers() {
// Initialize observer to attach the VideoTrack to the surface renderers
viewModel.localVideoTrack.observe(viewLifecycleOwner, Observer { videoTrack ->
viewModel.remoteVideoTrack.observe(viewLifecycleOwner, Observer { videoTrack ->

We must do the same for isConnected and onError. If isConnected changes to false, the app will go back to the FormFragment. If onError is triggered, the app will display a snackbar message with the error and also go back to the FormFragment.

private fun setConnectionObservers(node: String, vmr: String, displayName: String) {
// Initialize observer to display connectivity changes
viewModel.isConnected.observe(viewLifecycleOwner, Observer { isConnected ->
if (!isConnected) {
// The conference finished

// Error detected. Display a Snackbar with it.
viewModel.onError.observe(viewLifecycleOwner, Observer { exception ->
val error = when (exception) {
is NoSuchConferenceException -> {
resources.getString(R.string.conference_not_found, vmr)
else -> {
resources.getString(R.string.cannot_connect, node)
val parentView = requireActivity().findViewById<View>(
Snackbar.make(parentView, error, Snackbar.LENGTH_LONG).show()
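The when expression above picks a message according to the type of the exception. The same idea can be sketched in plain Kotlin, without Android resources. ConferenceNotFound is a hypothetical stand-in for the SDK's NoSuchConferenceException, and the message texts are assumptions, not the strings used by the tutorial.

```kotlin
// Hypothetical stand-in for the SDK exception raised when the VMR does not exist.
class ConferenceNotFound : Exception()

// Map an exception to a user-facing message. The texts are illustrative only.
fun errorMessage(exception: Throwable, node: String, vmr: String): String =
    when (exception) {
        is ConferenceNotFound -> "Conference \"$vmr\" not found"
        else -> "Cannot connect to $node"
    }
```

Keeping the mapping in one small function makes it easy to add further cases later, for example a dedicated message for a conference protected by a PIN.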

Finally, you need to define a method that will request access to the camera and microphone. If there is any error in the request, the app will go back to the FormFragment and display an error message in a snackbar.

private fun checkMediaPermissions(callback: () -> Unit) {
val requestMultiplePermissions =
registerForActivityResult(ActivityResultContracts.RequestMultiplePermissions()) { permissions ->
if (!permissions.entries.all { it.value }) {
val parentView = requireActivity().findViewById<View>(
Snackbar.make(parentView, R.string.grant_media_permissions, Snackbar.LENGTH_LONG).show()
} else {
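The permission callback receives a map from permission name to a granted flag, and the check above accepts the result only if every entry is true. A minimal plain-Kotlin sketch of that check follows; allGranted and deniedPermissions are hypothetical helpers introduced for illustration, and the second one additionally lists which permissions were refused.

```kotlin
// True only when every requested permission was granted.
fun allGranted(results: Map<String, Boolean>): Boolean =
    results.values.all { it }

// Names of the permissions the user refused, useful for a more specific message.
fun deniedPermissions(results: Map<String, Boolean>): List<String> =
    results.filterValues { granted -> !granted }.keys.toList()
```

With the camera granted and the microphone refused, allGranted returns false and deniedPermissions returns a list containing only the microphone permission.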

Run the app

You have finished the first tutorial and the videoconferencing app is ready to use. Launch it on your device and try to join a VMR that you have previously created. Keep in mind that, at this moment, the application doesn't support VMRs protected with a PIN. You will add PIN support in the next tutorial.

You can compare your code with the solution in Step.1-Solution-Join-a-conference. You can also check the differences with the previous tutorial in the git diff.
