Teghan/build one dockerfile #2

Open
wants to merge 76 commits into
base: main
fb9267b
Fix UpdateRegisteredModel and UpdateModelVersion to honor path variab…
kriscon-db Oct 2, 2024
efc1db0
Add owner, createdBy, updatedBy to all securables and regen openapi d…
kriscon-db Sep 12, 2024
708968b
ci: first pass at multi arch build
tnightengale Sep 13, 2024
840c1e6
ci: fix tab
tnightengale Sep 13, 2024
51fa7ba
ci: fix docker action versions
tnightengale Sep 13, 2024
722ea5f
ci: rm QEMU, add registry caching, enable push
tnightengale Sep 13, 2024
098a11e
ci: QEMU again, build-push-action@v6
tnightengale Sep 13, 2024
15002a1
ci: amd64 only
tnightengale Sep 13, 2024
be4b612
Run `javafmtAll` on main repo (#479)
vikrantpuppala Sep 13, 2024
b9f0d2e
ci: follow tagging best practices
tnightengale Sep 13, 2024
ffc0a16
ci: no push until ready to publish
tnightengale Sep 13, 2024
085d116
wip: tarball jars
tnightengale Sep 15, 2024
1044b5f
build: simplify into one Dockerfile
tnightengale Sep 16, 2024
69bfe53
build: add psql and mysql jdbc driver jars to image
tnightengale Sep 16, 2024
390462f
Enable identity logging on create/update paths (#483)
kriscon-db Sep 13, 2024
cbcfb2f
Specify release settings for spark connector (#484)
bijanhoule Sep 13, 2024
1c6a30f
[SPARK] Test parquet table creation by Spark 3.5.3 (#357)
amaliujia Sep 14, 2024
0dfb470
build: finished ui and main Dockerfiles
tnightengale Sep 18, 2024
0257414
feat: hibernate properties use env vars
tnightengale Sep 18, 2024
3cd58cf
ci: adds compose.yaml
tnightengale Sep 18, 2024
8bad147
Build/release fixes (#489)
bijanhoule Sep 16, 2024
4c76dec
UI READ support for models and versions (#460)
jamieknight-db Sep 17, 2024
2878651
Add temporary path credentials api (#478)
vikrantpuppala Sep 17, 2024
9a87257
Fixed UC Cli NullPointerException for invalid entity types (#445)
yaoderek Sep 17, 2024
1c1d0ec
Update UI Readme (#459)
JuanPabloDiaz Sep 17, 2024
ae8914f
Mark spark-sql dependency as Provided (#490)
bijanhoule Sep 17, 2024
ed5e196
add new acl fields to all entities (#491)
jamieknight-db Sep 17, 2024
500af83
Added a python script explaining how to integration test mlflow with …
kriscon-db Sep 18, 2024
557d230
Fix feature warning by adding import clause in UCSingleCatalog.scala …
vksx Sep 18, 2024
0e11fb4
docs: Docker Quickstart
tnightengale Sep 18, 2024
9e19e23
docs: add external db to README
tnightengale Sep 18, 2024
427cb31
ci: add multi-arch builds to docker github action
tnightengale Sep 18, 2024
68bb9a8
build: use amazoncorretto as alpine base, runtime image is 1.35GB
tnightengale Sep 18, 2024
edd5edf
ci: add caching and pushing for workflow, if enabled
tnightengale Sep 18, 2024
4aee7dc
fix: fmt docker-build.yml
tnightengale Sep 18, 2024
ee73581
build: hold on MySQL jdbc connector
tnightengale Sep 18, 2024
60dcdd2
ci: add buildx fix
tnightengale Sep 18, 2024
fdcce06
ci: rm dockerhub steps
tnightengale Sep 19, 2024
2d90866
build: exclude LGPL 2.1 deps from uk.org.webcomprere
tnightengale Sep 19, 2024
abb8297
build: whitelist LGPL 2.1 spotbugs
tnightengale Sep 19, 2024
2a3017c
fix: rm bash HOME in Dockerfile, use DB_CLOSE in default hibernate co…
tnightengale Sep 19, 2024
223f4b7
style: Dockerfile args order
tnightengale Sep 19, 2024
052e914
docs: add docs for ORC tables (#388)
avriiil Sep 20, 2024
f1bdcba
Docs: Managed vs External tables (#471)
avriiil Sep 20, 2024
701c635
UC should work without Delta Lake (#467)
cloud-fan Sep 21, 2024
fd832d9
docs: fix Daft integration guide (#498)
peter-gy Sep 21, 2024
7035708
Added initial documentation for models (#447)
kriscon-db Sep 23, 2024
5800a75
fix: show detail message while create catalog (#497)
foxty Sep 24, 2024
73587ae
Updated Delta 3.2.1 to new RC (#505)
tdas Sep 24, 2024
b082591
Basic server access control (#378)
creechy Sep 25, 2024
92caa9a
Fix duplicate index name. (#492)
creechy Sep 25, 2024
4d95e04
Change wording on successful identity provider validation. (#509)
creechy Sep 25, 2024
bba46e1
support create external delta table (#473)
cloud-fan Sep 26, 2024
18a2c88
Include Rapidstart.md to run Spark, MLflow, and Auth integration end-…
dennyglee Sep 26, 2024
7c911dc
revert: remove psql jar and hibernate interface for container
tnightengale Sep 30, 2024
1167920
docs: update README, add docker_quickstart.md, touch rapidstart.md
tnightengale Sep 30, 2024
060f0ab
chore: remove unused docker/
tnightengale Sep 30, 2024
681d8e1
fix: docker compose volumes and home
tnightengale Sep 30, 2024
ba23041
fix: delete .env
tnightengale Sep 30, 2024
fe20389
fix ui readme image link (#514)
jamieknight-db Sep 26, 2024
5196beb
Mlflow docs patch (#515)
dennyglee Sep 26, 2024
748f0b7
Add access control to temporary credentials endpoints. (#511)
creechy Sep 26, 2024
40e69da
Remove delta 3.2.1 prerelease resolver, official release is out (#518)
bijanhoule Sep 26, 2024
aa45f2f
Remove PATH_REFRESH as it is an unused path operation (#520)
rameshchandra Sep 27, 2024
fe3a386
Update UC docs (#526)
dennyglee Sep 28, 2024
66acfb5
Fix javadoc generation issue (#519)
bijanhoule Sep 28, 2024
1980cc9
Include UI and auth docs (#528)
dennyglee Sep 29, 2024
e208cf3
Feat/delete model version (#517)
jamieknight-db Sep 30, 2024
7584f72
Update docs to include CLI managing users (#530)
dennyglee Sep 30, 2024
32bcd8d
Update auth.md (#531)
dennyglee Sep 30, 2024
4b0108e
docs: put docker in new docs
tnightengale Oct 1, 2024
2dd4c23
revert: add back ui service in compose.yaml
tnightengale Oct 1, 2024
9a3e065
docs: touch README
tnightengale Oct 1, 2024
c7c35dc
Update docs by running `build/sbt generate` (#534)
vikrantpuppala Oct 1, 2024
94959e5
Merge branch 'main' into teghan/build_one_dockerfile
tnightengale Oct 2, 2024
cf58faa
revert: git merge error
tnightengale Oct 2, 2024
55 changes: 55 additions & 0 deletions .github/workflows/docker-build.yml
@@ -0,0 +1,55 @@
name: Build and Push Docker Images

on:
  push:
    branches:
      - main
    tags:
      - 'v*'
  pull_request:
    branches:
      - main

env:
  IMAGE: ${{ github.repository }} # eg. unitycatalog/unitycatalog

jobs:
  build-and-push:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Apply metadata and tags
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.IMAGE }}

      # TODO: Agree on a strategy for publishing to DockerHub
      # - name: Login to Docker Hub
      #   uses: docker/login-action@v3
      #   with:
      #     username: ${{ vars.DOCKERHUB_USERNAME }}
      #     password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Build and push Docker image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: false # TODO: When we are ready to publish, change this to: ${{ github.event_name != 'pull_request' }}
          platforms: linux/amd64,linux/arm64
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

          # TODO: Agree on a caching strategy
          # NOTE: See here for an explanation of registry caching: https://tinyurl.com/4b8mfexr
          # cache-from: type=registry,ref=${{ env.IMAGE }}:buildcache
          # cache-to: type=registry,ref=${{ env.IMAGE }}:buildcache,mode=max
173 changes: 43 additions & 130 deletions Dockerfile
@@ -1,147 +1,60 @@
ARG unitycatalog_uid=185
ARG unitycatalog_home="/opt/unitycatalog"
ARG unitycatalog_repo="${unitycatalog_home}/repo"
ARG unitycatalog_jar="server/target"
ARG unitycatalog_jars="${unitycatalog_home}/jars"
ARG unitycatalog_etc="etc"
ARG unitycatalog_bin="bin"
ARG unitycatalog_user_name="unitycatalog"
ARG unitycatalog_user_home="home"
ARG unitycatalog_user_basedir="${unitycatalog_home}/${unitycatalog_user_home}"
# Specify any custom parameters necessary to generate
# the Uber-Jar by SBT.
# Note: The default allocated heap memory size is too small
# and will cause the process to fail when attempting to compile
# and generate the Uber-Jar. Therefore it is important to choose
# a size large enough for the compiler to run.
ARG sbt_args="-J-Xmx2G"
# FIXME Pass it from the outside
ARG unitycatalog_version="0.2.0-SNAPSHOT"
ARG jars_directory="server/target/jars"
# syntax=docker.io/docker/dockerfile:1.7-labs
ARG HOME="/opt/unitycatalog"
ARG ALPINE_VERSION="3.20"

FROM eclipse-temurin:17-jdk-alpine AS package_server
# Build stage, using Amazon Corretto jdk 17 on alpine with arm64 support
FROM amazoncorretto:17-alpine${ALPINE_VERSION}-jdk as base

ARG unitycatalog_repo
ARG sbt_args
ARG jars_directory
# Dependencies are installed in $HOME/.cache by sbt
ARG HOME
ENV HOME=$HOME

# Install required packages
RUN <<EOF
set -ex;
apk update;
apk upgrade;
apk add bash git;
rm -R /var/cache/apk/*;
EOF
WORKDIR $HOME

WORKDIR "${unitycatalog_repo}"
COPY --parents build/ project/ examples/ server/ api/ version.sbt build.sbt ./

COPY . .
RUN apk add --no-cache bash && ./build/sbt -info clean package

# Builds server/target/unitycatalog-server-0.2.0-SNAPSHOT.jar
RUN build/sbt ${sbt_args} server/package
# Small runtime image
FROM alpine:${ALPINE_VERSION} as runtime

# As a side effect all the required jars to build the UC server
# are listed in server/target/classpath that is generated at build time
# Copy the jar files into a single directory
RUN ./docker/copy_jars_from_classpath.sh ${jars_directory}
# Specific JAVA_HOME from Amazon Corretto
ARG JAVA_HOME="/usr/lib/jvm/default-jvm"
ARG USER="unitycatalog"
ARG HOME

FROM eclipse-temurin:17-jdk-alpine AS build_uc
# Copy Java from base
COPY --from=base $JAVA_HOME $JAVA_HOME

ARG unitycatalog_uid
ARG unitycatalog_home
ARG unitycatalog_repo
ARG unitycatalog_jar
ARG unitycatalog_jars
ARG unitycatalog_etc
ARG unitycatalog_bin
ARG unitycatalog_user_name
ARG unitycatalog_user_home
ARG unitycatalog_user_basedir
ARG sbt_args
ARG unitycatalog_version
ARG jars_directory
ENV HOME=$HOME \
JAVA_HOME=$JAVA_HOME \
PATH="${JAVA_HOME}/bin:${PATH}"

EXPOSE 8080 8081
# Copy build artifacts from base stage
COPY --from=base --parents \
$HOME/examples/ \
$HOME/server/ \
$HOME/api/ \
$HOME/target/ \
$HOME/.cache/ \
/

# Create a service user with read and execute permissions and write permissions of the ./etc directory
RUN <<EOF
set -ex;
apk update;
apk upgrade;
apk add bash;
rm -R /var/cache/apk/*;
EOF

# Define the shell used within the container
SHELL ["/bin/bash", "-i", "-c", "-o", "pipefail"]

ENV UC_SERVER_BIN="${unitycatalog_home}/${unitycatalog_bin}/start-uc-server"

RUN <<-EOF
set -ex;
mkdir -p "${unitycatalog_jars}";
mkdir -p "${unitycatalog_home}/${unitycatalog_etc}";
mkdir -p "${unitycatalog_home}/${unitycatalog_bin}";
mkdir -p "${unitycatalog_home}/${unitycatalog_user_home}";
EOF

# Create system group and user for Unity Catalog
# Ensure the user created has their HOME pointing to the volume
# created to persist user data and the sbt cached files that
# are created as a result of compiling the unity catalog.
# This also ensures that the container can run independently from
# the storage, so we can have ephemeral docker instances with --rm
# and still be able to run the unity catalog each time without problems.
RUN <<-EOF
#!/usr/bin/env bash
set -ex;
addgroup --system --gid "${unitycatalog_uid}" "${unitycatalog_user_name}";
adduser --system --uid "${unitycatalog_uid}" \
--ingroup "${unitycatalog_user_name}" \
--home "${unitycatalog_user_basedir}" \
--shell "$(/usr/bin/env bash)" \
"${unitycatalog_user_name}";
apk add --no-cache bash
addgroup -S $USER
adduser -S -G $USER $USER
chmod -R 550 $HOME
mkdir -p $HOME/etc/
chmod -R 770 $HOME/etc/
chown -R $USER:$USER $HOME
EOF

# Define volume to persist Unity Catalog data
VOLUME "${unitycatalog_home}"

WORKDIR "$unitycatalog_home"

COPY --from=package_server "${server_jar}" "${unitycatalog_jars}/"
USER $USER

# Copy the etc folder which contains the config files and the data folder
COPY --from=package_server "${unitycatalog_repo}/${unitycatalog_etc}" "${unitycatalog_home}/${unitycatalog_etc}/"

COPY --from=package_server "${unitycatalog_repo}/${jars_directory}" "${unitycatalog_jars}/"

# Create the script that executes the server
# FIXME It could be already created and simply copied over
COPY <<-"EOF" "${UC_SERVER_BIN}"
#!/usr/bin/env bash

SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
ROOT_DIR="$(dirname "$SCRIPT_DIR")"

SERVER_CLASS_NAME="io.unitycatalog.server.UnityCatalogServer"
# A classpath entry consisting simply of * expands to a list of all the jar files in the current directory.
# A class path entry that contains * will not match class files.
SERVER_JAVA_COMMAND="java -cp "jars/classes:jars/*" ${SERVER_CLASS_NAME} $@"

cd ${ROOT_DIR}

exec ${SERVER_JAVA_COMMAND}
EOF

# Set ownership of directories and Unity Catalog home directory to a less
# privileged user
RUN <<-"EOF"
#!/usr/bin/env bash
set -ex;
chown -R "${unitycatalog_user_name}":"${unitycatalog_user_name}" "$unitycatalog_home";
chmod u+x "$UC_SERVER_BIN";
EOF
# Copy remaining directories here for caching optimization
COPY --chown=$USER:$USER --parents bin/ etc/ $HOME/

USER "${unitycatalog_user_name}"
WORKDIR $HOME

ENTRYPOINT ["/bin/bash", "bin/start-uc-server"]
CMD ["./bin/start-uc-server"]
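The new runtime stage sets `chmod -R 550 $HOME` and then `chmod -R 770 $HOME/etc/`, matching the comment about read and execute permissions everywhere plus write permission on `etc`. As a quick way to read those octal modes, here is a minimal Python sketch; the helper name `describe_mode` is hypothetical and only illustrates what the two modes mean for a directory.

```python
import stat


def describe_mode(mode: int, is_dir: bool = True) -> str:
    """Render an octal permission mode the way `ls -l` would show it.

    `describe_mode` is an illustrative helper, not part of the PR.
    """
    kind = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(kind | mode)


# 550: owner and group may read and traverse, nobody may write.
home_mode = describe_mode(0o550)
# 770: owner and group may also write, which the server needs under etc/.
etc_mode = describe_mode(0o770)
```

Because `chmod -R` is recursive, the same bits also land on regular files, so files under `$HOME` end up read and execute only, while files under `$HOME/etc/` stay writable for the owner and group.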
4 changes: 3 additions & 1 deletion README.md
@@ -57,6 +57,9 @@ Let's take Unity Catalog for spin. In this guide, we are going to do the followi
An example project is provided to demonstrate how to use the UC SDK for various assets
as well as provide a convenient way to explore the content of any UC server implementation.

> If you prefer to run Unity Catalog in Docker, use `docker compose up`.
> See the [Docker Quickstart](./docs/docker_quickstart.md) for more details.

### Prerequisites

You have to ensure that your local environment has the following:
@@ -65,7 +68,6 @@ You have to ensure that your local environment has the following:
- Ensure the `JAVA_HOME` environment variable your terminal is configured to point to JDK17.
- Compile the project using `build/sbt package`

> If you prefer to run this using the Unity Catalog Dockerized Environment, please refer to the Docker [README.md](./docker/README.md)

### Run the UC Server

2 changes: 0 additions & 2 deletions api/Models/UpdateModelVersion.md
@@ -3,8 +3,6 @@

| Name | Type | Description | Notes |
|------------ | ------------- | ------------- | -------------|
| **full\_name** | **String** | The full name of the registered model to update. | [optional] [default to null] |
| **version** | **Long** | Version number of the version to update. | [optional] [default to null] |
| **comment** | **String** | User-provided free-form text description. | [optional] [default to null] |

[[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md)
1 change: 0 additions & 1 deletion api/Models/UpdateRegisteredModel.md
@@ -3,7 +3,6 @@

| Name | Type | Description | Notes |
|------------ | ------------- | ------------- | -------------|
| **full\_name** | **String** | The full name of the registered model to update. | [optional] [default to null] |
| **comment** | **String** | User-provided free-form text description. | [optional] [default to null] |
| **new\_name** | **String** | New name for the model. | [optional] [default to null] |

10 changes: 0 additions & 10 deletions api/all.yaml
@@ -2083,9 +2083,6 @@ components:
    UpdateRegisteredModel:
      type: object
      properties:
        full_name:
          description: The full name of the registered model to update.
          type: string
        comment:
          description: User-provided free-form text description.
          type: string
@@ -2133,13 +2130,6 @@ components:
    UpdateModelVersion:
      type: object
      properties:
        full_name:
          description: The full name of the registered model to update.
          type: string
        version:
          description: Version number of the version to update.
          type: integer
          format: int64
        comment:
          description: User-provided free-form text description.
          type: string
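With `full_name` and `version` removed from the `UpdateRegisteredModel` and `UpdateModelVersion` bodies, the target of an update is identified only by the request path, and the body carries just the fields being changed. The sketch below shows how a client might build such requests. The base URL, the `PATCH` method, and the `/models/{full_name}` and `/models/{full_name}/versions/{version}` paths are assumptions that do not appear in this diff, and both function names are hypothetical.

```python
import json
from urllib.parse import quote

# Assumed base URL; the diff does not show the server's API prefix.
BASE = "http://localhost:8080/api/2.1/unity-catalog"


def update_registered_model_request(full_name, comment=None, new_name=None):
    """Return (method, url, body) for a registered-model update.

    The model is addressed by the path; the body holds only the
    properties that remain in the UpdateRegisteredModel schema.
    """
    fields = {"comment": comment, "new_name": new_name}
    body = {k: v for k, v in fields.items() if v is not None}
    url = f"{BASE}/models/{quote(full_name, safe='')}"
    return "PATCH", url, json.dumps(body)


def update_model_version_request(full_name, version, comment=None):
    """Return (method, url, body) for a model-version update.

    Both the model name and the version number travel in the path.
    """
    body = {} if comment is None else {"comment": comment}
    url = f"{BASE}/models/{quote(full_name, safe='')}/versions/{int(version)}"
    return "PATCH", url, json.dumps(body)
```

Keeping the identifiers out of the body avoids the ambiguity of a request whose path names one model while its body names another.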
29 changes: 29 additions & 0 deletions compose.yaml
@@ -0,0 +1,29 @@
name: unitycatalog

services:

  server:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    volumes:
      - type: bind
        source: ./etc/conf
        target: /opt/unitycatalog/etc/conf
      - type: volume
        source: unitycatalog_data
        target: /opt/unitycatalog/etc/data
  ui:
    build:
      context: ui/
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    depends_on:
      - server

volumes:
  # Persist docker volume across container restarts
  unitycatalog_data:
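The compose file publishes the server on port 8080 and the UI on port 3000. The short form of `depends_on` only orders container startup and does not wait for the server to accept connections, so a script that runs right after `docker compose up -d` may want to poll the port first. The following is a minimal sketch of such a check; `wait_for_port` is a hypothetical helper, and it only confirms that a TCP connection succeeds, not that the Unity Catalog API is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses.

    Returns True as soon as one connection attempt succeeds, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out; retry after a short pause.
            time.sleep(interval)
    return False


# Example (not executed here): wait_for_port("localhost", 8080) for the
# server, then wait_for_port("localhost", 3000) for the UI.
```

A stricter check could issue an HTTP request against a known endpoint instead, at the cost of depending on the API paths.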